Genetic variants in the TCF7L2 gene as diagnostic markers for risk of type 2 diabetes mellitus

ABSTRACT

Polymorphisms in the gene TCF7L2 are shown by association analysis to be associated with susceptibility to type II diabetes. Methods of diagnosis of susceptibility to diabetes, of decreased susceptibility to diabetes and of protection against diabetes are described, as are methods of treatment for type II diabetes.

RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 11/454,296, filed Jun. 16, 2006, which claims the benefit of U.S. Provisional Application No. 60/757,155, filed on Jan. 6, 2006 and U.S. Provisional Application No. 60/692,174, filed on Jun. 20, 2005. The entire teachings of the above applications are incorporated herein by reference.

BACKGROUND OF THE INVENTION

Diabetes mellitus, a metabolic disease wherein carbohydrate utilization is reduced and lipid and protein utilization is enhanced, is caused by an absolute or relative deficiency of insulin. In the more severe cases, diabetes is characterized by chronic hyperglycemia, glycosuria, water and electrolyte loss, ketoacidosis and coma. Long term complications include development of neuropathy, retinopathy, nephropathy, generalized degenerative changes in large and small blood vessels and increased susceptibility to infection. The most common form of diabetes is Type II, non-insulin-dependent diabetes that is characterized by hyperglycemia due to impaired insulin secretion and insulin resistance in target tissues. Both genetic and environmental factors contribute to the disease. For example, obesity plays a major role in the development of the disease. Type II diabetes is often a mild form of diabetes mellitus of gradual onset.

The health implications of Type II diabetes are enormous. In 1995, there were 135 million adults with diabetes worldwide. It is estimated that close to 300 million will have diabetes in the year 2025 (King H., et al., Diabetes Care, 21(9): 1414-1431 (1998)). The prevalence of Type II diabetes in the adult population in Iceland is 2.5% (Vilbergsson, S., et al., Diabet. Med., 14(6): 491-498 (1997)), which comprises approximately 5,000 people over the age of 34 who have the disease. The high prevalence of the disease and the increasing population affected show an unmet medical need to define the genetic factors involved in Type II diabetes and thereby to define the associated risk factors more precisely. Also needed are therapeutic agents for prevention of Type II diabetes.

SUMMARY OF THE INVENTION

The present invention relates to methods of diagnosing an increased susceptibility to type II diabetes, as well as methods of diagnosing a decreased susceptibility to type II diabetes or diagnosing a protection against type II diabetes, by evaluating certain markers or haplotypes relating to the TCF7L2 gene (transcription factor 7-like 2 (T-cell specific, HMG-box), previously referred to as the TCF4 gene (T-cell transcription factor 4)). The methods comprise detecting a genetic marker associated with the exon 4 LD block of the TCF7L2 gene.

In a first aspect, the invention relates to a method of diagnosing a susceptibility to type II diabetes in an individual, comprising analyzing a nucleic acid sample obtained from the individual for a marker or haplotype associated with the exon 4 LD block of TCF7L2, wherein the presence of the marker or haplotype is indicative of a susceptibility to type II diabetes. In one embodiment, the marker or haplotype comprises at least one marker selected from the markers listed in Table 6. In another embodiment, the marker or haplotype is a marker.

In one preferred embodiment, the marker or haplotype is indicative of increased susceptibility to type II diabetes. The increased susceptibility is in one embodiment characterized by a relative risk of at least 1.2, including a relative risk of at least 1.3 and a relative risk of at least 1.4. In one embodiment, the marker is selected from the group consisting of DG10S478, rs12255372, rs7895340, rs11196205, rs7901695, rs7903146, rs12243326, and rs4506565, and wherein the presence of a non-0 allele (e.g., −4, 4, 8, 12, 16, 20, or other non-0 allele) in DG10S478; a T allele in rs12255372; an A allele in rs7895340; a C allele in rs11196205; a C allele in rs7901695; a T allele in rs7903146; a C allele in rs12243326; or a T allele in rs4506565, is indicative of increased susceptibility to type II diabetes. In a preferred embodiment, the marker is selected from the group consisting of DG10S478 and rs7903146, and wherein the presence of a non-0 allele in DG10S478 or a T allele in rs7903146 is indicative of increased susceptibility to type II diabetes. In yet another preferred embodiment, the marker is rs7903146, and wherein the presence of a T allele in rs7903146 is indicative of increased susceptibility to type II diabetes.

In another preferred embodiment, the marker or haplotype is indicative of decreased susceptibility to type II diabetes. The decreased susceptibility is in one embodiment characterized by a relative risk of less than 0.8, including a relative risk of less than 0.7. In one embodiment, the marker is selected from the group consisting of DG10S478, rs12255372, rs7895340, rs11196205, rs7901695, rs7903146, rs12243326, and rs4506565, and wherein the presence of a 0 allele in DG10S478; a G allele in SNP rs12255372; a G allele in rs7895340; a G allele in rs11196205; a T allele in rs7901695; a C allele in rs7903146; a T allele in rs12243326; or an A allele in rs4506565 is indicative of a decreased susceptibility to type II diabetes. In a preferred embodiment, the marker is DG10S478, and wherein the presence of a 0 allele in DG10S478 is indicative of decreased susceptibility to type II diabetes. In another preferred embodiment, the marker is rs7903146, and wherein the presence of a C allele in rs7903146 is indicative of decreased susceptibility to type II diabetes.
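The allele assignments recited in the two preceding paragraphs can be gathered into a simple lookup table. The following is a minimal sketch only: the allele calls are transcribed from the text above, while the dictionary layout and the function name `classify_allele` are illustrative assumptions and not part of the described methods.

```python
# At-risk and protective alleles as recited in the Summary above.
# The data structure and function name are hypothetical, for illustration only.
MARKER_ALLELES = {
    "rs12255372": {"risk": "T", "protective": "G"},
    "rs7895340": {"risk": "A", "protective": "G"},
    "rs11196205": {"risk": "C", "protective": "G"},
    "rs7901695": {"risk": "C", "protective": "T"},
    "rs7903146": {"risk": "T", "protective": "C"},
    "rs12243326": {"risk": "C", "protective": "T"},
    "rs4506565": {"risk": "T", "protective": "A"},
}


def classify_allele(marker, allele):
    """Return 'risk', 'protective', or 'unknown' for one observed allele."""
    if marker == "DG10S478":
        # Microsatellite: the 0 allele is protective, any non-0 allele is at-risk.
        return "protective" if int(allele) == 0 else "risk"
    entry = MARKER_ALLELES[marker]
    observed = str(allele).upper()
    if observed == entry["risk"]:
        return "risk"
    if observed == entry["protective"]:
        return "protective"
    return "unknown"
```

For example, `classify_allele("rs7903146", "T")` reports the at-risk allele, and `classify_allele("DG10S478", 0)` reports the protective allele.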

In a second aspect, the present invention relates to a kit for assaying a sample from an individual to detect a susceptibility to type II diabetes, wherein the kit comprises one or more reagents for detecting one or more markers associated with the exon 4 LD block of TCF7L2. In one embodiment, the one or more reagents comprise at least one contiguous nucleotide sequence that is completely complementary to a region comprising at least one marker associated with the exon 4 LD block of TCF7L2. In one embodiment, the one or more markers are selected from the group consisting of DG10S478, rs12255372, rs7895340, rs11196205, rs7901695, rs7903146, rs12243326, and rs4506565. In a preferred embodiment, the one or more markers are DG10S478 or rs7903146. In another preferred embodiment, the marker is the C allele in rs7903146.

In another aspect, the present invention relates to a method of assessing an individual for probability of response to a TCF7L2 therapeutic agent, comprising: detecting a marker associated with the exon 4 LD block of TCF7L2, wherein the presence of the marker is indicative of a probability of a positive response to a TCF7L2 therapeutic agent. In one embodiment, the marker is selected from the group consisting of DG10S478, rs12255372, rs7895340, rs11196205, rs7901695, rs7903146, rs12243326, and rs4506565. In another embodiment, the marker is marker DG10S478 or marker rs7903146, and wherein the presence of a non-0 allele in DG10S478 or a T allele in rs7903146 is indicative of a probability of a positive response to a TCF7L2 therapeutic agent.

Another aspect of the invention relates to the use of a TCF7L2 therapeutic agent for the manufacture of a medicament for the treatment of type II diabetes. In one embodiment, the TCF7L2 therapeutic agent is an agent that alters activity in the Wnt signaling pathway or in the cadherin pathway. In another embodiment, the TCF7L2 therapeutic agent is an agent selected from the group set forth in the Agent Table.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention. The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.

The FIGURE depicts the TCF7L2 region of interest with respect to linkage disequilibrium (LD) of SNPs in HapMap project Build 16. The 215.9 kb gene spans seven LD blocks as indicated by the black arrow schematic (based on NCBI RefSeq) which shows the direction of transcription; exons are indicated, with exon 4 highlighted. DG10S478 is located at 114.46 Mb on chromosome 10 (NCBI Build 34) in intron 3 of the TCF7L2 gene, within a 74.9 kb block that incorporates part of intron 3, the whole of exon 4 and part of intron 4 (herein referred to as the “exon 4 LD block of TCF7L2”). The SNP markers are plotted equidistantly rather than according to their physical positions. The FIGURE shows two measures of LD, i.e., D′ (upper left part of FIGURE) and r² (lower right part).
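The two LD measures named in the FIGURE description, D′ and r², can be computed from haplotype and allele frequencies using their standard population-genetic definitions. The sketch below is illustrative only; the function name `ld_measures` and the input frequencies are assumptions and are not data from the FIGURE.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D_prime, r2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1 and
          allele B at marker 2
    p_a:  frequency of allele A at marker 1
    p_b:  frequency of allele B at marker 2
    """
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it can take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the alleles at the two markers.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

Two markers can show D′ = 1 while r² remains well below 1 when their allele frequencies differ, which is one reason the FIGURE reports both measures.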

DETAILED DESCRIPTION OF THE INVENTION

A description of preferred embodiments of the invention follows.

Loci Associated with Type II Diabetes

Type II diabetes is characterized by hyperglycemia, which can occur through mechanisms such as impaired insulin secretion, insulin resistance in peripheral tissues and increased glucose output by the liver. Most type II diabetes patients suffer serious complications of chronic hyperglycemia including nephropathy, neuropathy, retinopathy and accelerated development of cardiovascular disease. The prevalence of type II diabetes worldwide is currently 6% but is projected to rise over the next decade(1). This increase in prevalence of type II diabetes is attributed to increasing age of the population and rise in obesity.

There is evidence for a genetic component to the risk of type II diabetes, including prevalence differences between various racial groups(2, 3), higher concordance rates among monozygotic than dizygotic twins(4, 5) and a sibling relative risk (λ_s) for type II diabetes in European populations of approximately 3.5(6).

Two approaches have thus far been used to search for genes associated with type II diabetes. Single nucleotide polymorphisms (SNPs) within candidate genes have been tested for association, but the associations have, in general, either not been replicated or confer only a modest risk of type II diabetes; the most widely reported are a protective Pro12Ala polymorphism in the peroxisome proliferator activated receptor gamma gene (PPARG2)(7) and an at-risk polymorphism in the potassium inwardly-rectifying channel, subfamily J, member 11 gene (KIR6.2)(8).

Genome-wide linkage scans in families with the common form of type II diabetes have yielded several loci, and the primary focus of international research consortia has been on loci on chromosomes 1, 12 and 20 observed in many populations(6). The genes in these loci have yet to be uncovered. However, in Mexican Americans, the calpain 10 (CAPN10) gene was isolated out of a locus on chromosome 2q; this represents the only gene for the common form of type II diabetes to date to be identified through positional cloning(9). The rare Mendelian forms of type II diabetes, namely maturity-onset diabetes of the young (MODY), have yielded six genes by positional cloning(6).

We previously reported genome-wide significant linkage to chromosome 5q for type II diabetes mellitus in the Icelandic population(10); in the same study, we also reported suggestive evidence of linkage to 10q and 12q. Linkage to the 10q region has also been observed in Mexican Americans(11).

Transcription Factor 7-Like 2 Gene (TCF7L2) Association with Type II Diabetes

The present invention relates to identification of a type II diabetes-associated LD block (“exon 4 LD block of TCF7L2”) within the gene encoding T-cell transcription factor 4 (TCF4; official gene symbol TCF7L2). Several markers within the exon 4 LD block of TCF7L2, including microsatellite DG10S478 and SNP markers rs7903146 and rs12255372, have been found to be associated with type II diabetes. The original observation, first found in an Icelandic cohort, of the association of DG10S478 (P=1.3×10⁻⁹; Relative risk=1.45; Population attributable risk=22.7%), has subsequently been replicated in a Danish type II diabetes cohort and a United States Caucasian cohort. DG10S478 is located in intron 3 of the TCF7L2 gene on 10q25.2 and within a well defined LD block of 74.9 kb that encapsulates part of intron 3, the whole of exon 4 and part of intron 4. The TCF7L2 gene product is a high mobility group (HMG) box-containing transcription factor that plays a role in the Wnt signaling pathway, also known as the APC3/β-catenin/TCF pathway. TCF7L2 mediates the cell type-specific regulation of proglucagon gene expression (a key player in blood glucose homeostasis) through the Wnt pathway members β-catenin and glycogen synthase kinase-3beta(12). In addition, Wnt signaling maintains preadipocytes in an undifferentiated state through inhibition of the adipogenic transcription factors CCAAT/enhancer binding protein alpha (C/EBPalpha) and peroxisome proliferator-activated receptor gamma (PPARgamma)(13). When Wnt signaling in preadipocytes is prevented by overexpression of dominant-negative TCF7L2, these cells differentiate into adipocytes(13). In addition, it has been reported that the Wnt/β-catenin signaling pathway targets PPARgamma activity through physical interaction with β-catenin and TCF7L2 in colon cancer cells(14). The multifunctional β-catenin protein is also important for mediating cell adhesion through its binding of cadherins(15).
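The paragraph above reports a relative risk and a population attributable risk (PAR) for DG10S478 but does not state here how the PAR was derived. As a hedged illustration, the sketch below uses one common definition of PAR under a multiplicative model, in which each copy of the at-risk allele multiplies risk by the relative risk. The function name and the allele frequency in the usage note are hypothetical, and the sketch is not claimed to reproduce the reported 22.7%.

```python
def population_attributable_risk(allele_freq, relative_risk):
    """PAR under a multiplicative model (an assumed model, not the source's stated method).

    allele_freq:   population frequency of the at-risk allele (hypothetical input)
    relative_risk: risk per copy of the at-risk allele relative to the other allele
    """
    # Average per-chromosome risk relative to a chromosome carrying the
    # non-risk allele; squaring gives the average over diploid genotypes.
    average_relative_risk = (1 - allele_freq) + allele_freq * relative_risk
    return 1 - 1 / average_relative_risk ** 2
```

Calling `population_attributable_risk(0.25, 1.45)` with a purely hypothetical allele frequency of 0.25 returns the fraction of cases that would not occur, under this model, if every at-risk allele were replaced by the non-risk allele.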

As a result of this discovery, methods are now available for diagnosis of a susceptibility to type II diabetes, as well as for diagnosis of a decreased susceptibility to type II diabetes and/or a protection against type II diabetes. In preferred embodiments of the invention, diagnostic assays are used to identify the presence of particular alleles, including a 0 allele in marker DG10S478 (associated with a decreased susceptibility to type II diabetes and is an allele that is protective against type II diabetes); a non-0 allele (e.g., −4, 4, 8, 12, 16 or 20, or other allele) in marker DG10S478 (associated with susceptibility to type II diabetes); a G allele in SNP rs12255372 (associated with a decreased susceptibility to type II diabetes and is an allele that is protective against type II diabetes); a T allele in SNP rs12255372 (associated with susceptibility to type II diabetes); a G allele in SNP rs7895340 (associated with a decreased susceptibility to type II diabetes and is an allele that is protective against type II diabetes); an A allele in SNP rs7895340 (associated with susceptibility to type II diabetes); a G allele in SNP rs11196205 (associated with a decreased susceptibility to type II diabetes and is an allele that is protective against type II diabetes); a C allele in SNP rs11196205 (associated with susceptibility to type II diabetes); a T allele in SNP rs7901695 (associated with a decreased susceptibility to type II diabetes and is an allele that is protective against type II diabetes); a C allele in SNP rs7901695 (associated with susceptibility to type II diabetes); a C allele in SNP rs7903146 (associated with a decreased susceptibility to type II diabetes and is an allele that is protective against type II diabetes); a T allele in SNP rs7903146 (associated with a susceptibility to type II diabetes); a C allele in SNP rs12243326 (associated with a susceptibility to type II diabetes); and a T allele in SNP rs4506565 (associated with a susceptibility to type II diabetes). In additional embodiments of the invention, other markers or SNPs, identified using the methods described herein, can be used for diagnosis of a susceptibility to type II diabetes, and also for diagnosis of a decreased susceptibility to type II diabetes or for identification of an allele that is protective against type II diabetes. The diagnostic assays presented below can be used to identify the presence or absence of these particular alleles.

Diagnostic Assays

Nucleic acids, probes, primers, and antibodies such as those described herein can be used in a variety of methods of diagnosis of a susceptibility to type II diabetes, as well as in kits (e.g., useful for diagnosis of a susceptibility to type II diabetes). Similarly, the nucleic acids, probes, primers, and antibodies described herein can be used in methods of diagnosis of a decreased susceptibility to type II diabetes, as well as in methods of diagnosis of a protection against type II diabetes, and also in kits. In one aspect, the kit comprises primers that can be used to amplify the markers of interest.

In one aspect of the invention, diagnosis of a susceptibility to type II diabetes is made by detecting a polymorphism in a TCF7L2 nucleic acid as described herein (e.g., the alleles in marker DG10S478 or in SNP rs12255372, rs7895340, rs11196205, rs7901695, rs7903146, rs12243326, rs4506565). The polymorphism can be a change in a TCF7L2 nucleic acid, such as the insertion or deletion of a single nucleotide, or of more than one nucleotide, resulting in a frame shift; the change of at least one nucleotide, resulting in a change in the encoded amino acid; the change of at least one nucleotide, resulting in the generation of a premature stop codon; the deletion of several nucleotides, resulting in a deletion of one or more amino acids encoded by the nucleotides; the insertion of one or several nucleotides, such as by unequal recombination or gene conversion, resulting in an interruption of the coding sequence of the gene; duplication of all or a part of the gene; transposition of all or a part of the gene; or rearrangement of all or a part of the gene. More than one such change may be present in a single gene. Such sequence changes cause a difference in the polypeptide encoded by a TCF7L2 nucleic acid. For example, if the difference is a frame shift change, the frame shift can result in a change in the encoded amino acids, and/or can result in the generation of a premature stop codon, causing generation of a truncated polypeptide. Alternatively, a polymorphism associated with a disease or condition or a susceptibility to a disease or condition associated with a TCF7L2 nucleic acid can be a synonymous alteration in one or more nucleotides (i.e., an alteration that does not result in a change in the polypeptide encoded by a TCF7L2 nucleic acid). Such a polymorphism may alter splicing sites, affect the stability or transport of mRNA, or otherwise affect the transcription or translation of the gene. A TCF7L2 nucleic acid that has any of the changes or alterations described above is referred to herein as an “altered nucleic acid.”

In a first method of diagnosing a susceptibility to type II diabetes, hybridization methods, such as Southern analysis, Northern analysis, or in situ hybridizations, can be used (see Current Protocols in Molecular Biology, Ausubel, F. et al., eds, John Wiley & Sons, including all supplements through 1999). For example, a biological sample (a “test sample”) from a test subject (the “test individual”) of genomic DNA, RNA, or cDNA, is obtained from an individual (RNA and cDNA can only be used for exonic markers), such as an individual suspected of having, being susceptible to or predisposed for, or carrying a defect for, type II diabetes. The individual can be an adult, child, or fetus. The test sample can be from any source which contains genomic DNA, such as a blood sample, sample of amniotic fluid, sample of cerebrospinal fluid, or tissue sample from skin, muscle, buccal or conjunctival mucosa, placenta, gastrointestinal tract or other organs. A test sample of DNA from fetal cells or tissue can be obtained by appropriate methods, such as by amniocentesis or chorionic villus sampling. The DNA, RNA, or cDNA sample is then examined to determine whether a polymorphism in a TCF7L2 nucleic acid is present, and/or to determine which splicing variant(s) encoded by the TCF7L2 is present. The presence of the polymorphism or splicing variant(s) can be indicated by hybridization of the gene in the genomic DNA, RNA, or cDNA to a nucleic acid probe. A “nucleic acid probe”, as used herein, can be a DNA probe or an RNA probe; the nucleic acid probe can contain, for example, at least one polymorphism in a TCF7L2 nucleic acid and/or contain a nucleic acid encoding a particular splicing variant of a TCF7L2 nucleic acid. The probe can be any of the nucleic acid molecules described above (e.g., the gene or nucleic acid, a fragment, a vector comprising the gene or nucleic acid, a probe or primer, etc.).

To diagnose a susceptibility to type II diabetes, a hybridization sample can be formed by contacting the test sample containing a TCF7L2 nucleic acid with at least one nucleic acid probe. A preferred probe for detecting mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to mRNA or genomic DNA sequences described herein. The nucleic acid probe can be, for example, a full-length nucleic acid molecule, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to appropriate mRNA or genomic DNA. Suitable probes for use in the diagnostic assays of the invention are described above (see e.g., probes and primers discussed under the heading, “Nucleic Acids of the Invention”).

The hybridization sample is maintained under conditions that are sufficient to allow specific hybridization of the nucleic acid probe to a TCF7L2 nucleic acid. “Specific hybridization”, as used herein, indicates exact hybridization (e.g., with no mismatches). Specific hybridization can be performed under high stringency conditions or moderate stringency conditions, for example, as described above. In a particularly preferred aspect, the hybridization conditions for specific hybridization are high stringency.

Specific hybridization, if present, is then detected using standard methods. If specific hybridization occurs between the nucleic acid probe and TCF7L2 nucleic acid in the test sample, then the TCF7L2 has the polymorphism, or is the splicing variant, that is present in the nucleic acid probe. More than one nucleic acid probe can also be used concurrently in this method. Specific hybridization of any one of the nucleic acid probes is indicative of a polymorphism in the TCF7L2 nucleic acid, or of the presence of a particular splicing variant encoding the TCF7L2 nucleic acid and can be diagnostic for a susceptibility to type II diabetes, or for a decreased susceptibility to type II diabetes (or indicative of a protective allele against type II diabetes).

In Northern analysis (see Current Protocols in Molecular Biology, Ausubel, F. et al., eds., John Wiley & Sons, supra) the hybridization methods described above are used to identify the presence of a polymorphism or a particular splicing variant, associated with a susceptibility to type II diabetes or associated with a decreased susceptibility to type II diabetes. For Northern analysis, a test sample of RNA is obtained from the individual by appropriate means. Specific hybridization of a nucleic acid probe, as described above, to RNA from the individual is indicative of a polymorphism in a TCF7L2 nucleic acid, or of the presence of a particular splicing variant encoded by a TCF7L2 nucleic acid and is therefore diagnostic for the susceptibility to type II diabetes or the decreased susceptibility to type II diabetes (or indicative of a protective allele against type II diabetes).

For representative examples of use of nucleic acid probes, see, for example, U.S. Pat. Nos. 5,288,611 and 4,851,330.

Alternatively, a peptide nucleic acid (PNA) probe can be used instead of a nucleic acid probe in the hybridization methods described above. PNA is a DNA mimic having a peptide-like, inorganic backbone, such as N-(2-aminoethyl) glycine units, with an organic base (A, G, C, T or U) attached to the glycine nitrogen via a methylene carbonyl linker (see, for example, Nielsen, P. E. et al., Bioconjugate Chemistry 5, American Chemical Society, p. 1 (1994)). The PNA probe can be designed to specifically hybridize to a TCF7L2 nucleic acid. Hybridization of the PNA probe to a TCF7L2 nucleic acid can be diagnostic for a susceptibility to type II diabetes or decreased susceptibility to type II diabetes (or indicative of a protective allele against type II diabetes).

In another method of the invention, alteration analysis by restriction digestion can be used to detect an alteration in the gene, if the alteration (mutation) or polymorphism in the gene results in the creation or elimination of a restriction site. A test sample containing genomic DNA is obtained from the individual. Polymerase chain reaction (PCR) can be used to amplify a TCF7L2 nucleic acid (and, if necessary, the flanking sequences) in the test sample of genomic DNA from the test individual. RFLP analysis is conducted as described (see Current Protocols in Molecular Biology, supra). The digestion pattern of the relevant DNA fragment indicates the presence or absence of the alteration or polymorphism in the TCF7L2 nucleic acid, and therefore indicates the presence or absence of a susceptibility to type II diabetes or a decreased susceptibility to type II diabetes (or is indicative of a protective allele against type II diabetes).
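The restriction-digestion logic described above, in which a polymorphism creates or eliminates a recognition site and thereby changes the digestion pattern, can be sketched as follows. This is an illustrative sketch only: the amplicon sequences are hypothetical and are not TCF7L2 sequences, the function names are assumptions, and the example uses the well-known EcoRI site (GAATTC, cut after the first G) simply as a familiar palindromic enzyme. Only one strand is scanned, which suffices for a palindromic site.

```python
def find_sites(seq, site):
    """Return 0-based start positions of a recognition site in a sequence."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def rflp_effect(ref_seq, alt_seq, site):
    """Report whether the variant allele creates or eliminates a site."""
    ref_count = len(find_sites(ref_seq, site))
    alt_count = len(find_sites(alt_seq, site))
    if alt_count > ref_count:
        return "creates"
    if alt_count < ref_count:
        return "eliminates"
    return "none"


def fragment_lengths(seq, site, cut_offset):
    """Lengths of the fragments produced by complete digestion of a linear amplicon."""
    cuts = [i + cut_offset for i in find_sites(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical amplicons differing at a single position (A versus G).
REF_AMPLICON = "ACGTGAATTCTTAG"
ALT_AMPLICON = "ACGTGAGTTCTTAG"
```

Here the reference amplicon is cut into two fragments, while the variant amplicon, which has lost the site, remains uncut; comparing the two fragment patterns is what distinguishes the alleles on a gel.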

Sequence analysis can also be used to detect specific polymorphisms in a TCF7L2 nucleic acid. A test sample of DNA or RNA is obtained from the test individual. PCR or other appropriate methods can be used to amplify the gene or nucleic acid, and/or its flanking sequences, if desired. The sequence of a TCF7L2 nucleic acid, or a fragment of the nucleic acid, or cDNA, or fragment of the cDNA, or mRNA, or fragment of the mRNA, is determined, using standard methods. The sequence of the nucleic acid, nucleic acid fragment, cDNA, cDNA fragment, mRNA, or mRNA fragment is compared with the known nucleic acid sequence of the gene or cDNA or mRNA, as appropriate. The presence of a polymorphism in the TCF7L2 indicates that the individual has a susceptibility to type II diabetes or a decreased susceptibility to type II diabetes (or is indicative of a protective allele against type II diabetes).

Allele-specific oligonucleotides can also be used to detect the presence of a polymorphism in a TCF7L2 nucleic acid, through the use of dot-blot hybridization of amplified oligonucleotides with allele-specific oligonucleotide (ASO) probes (see, for example, Saiki, R. et al., Nature 324:163-166 (1986)). An “allele-specific oligonucleotide” (also referred to herein as an “allele-specific oligonucleotide probe”) is an oligonucleotide of approximately 10-50 base pairs, preferably approximately 15-30 base pairs, that specifically hybridizes to a TCF7L2 nucleic acid, and that contains a polymorphism associated with a susceptibility to type II diabetes or a polymorphism associated with a decreased susceptibility to type II diabetes (or indicative of a protective allele against type II diabetes). An allele-specific oligonucleotide probe that is specific for particular polymorphisms in a TCF7L2 nucleic acid can be prepared, using standard methods (see Current Protocols in Molecular Biology, supra). To identify polymorphisms in the gene that are associated with type II diabetes, a test sample of DNA is obtained from the individual. PCR can be used to amplify all or a fragment of a TCF7L2 nucleic acid and its flanking sequences. The DNA containing the amplified TCF7L2 nucleic acid (or fragment of the gene or nucleic acid) is dot-blotted, using standard methods (see Current Protocols in Molecular Biology, supra), and the blot is contacted with the oligonucleotide probe. The presence of specific hybridization of the probe to the amplified TCF7L2 nucleic acid is then detected. Hybridization of an allele-specific oligonucleotide probe to DNA from the individual is indicative of a polymorphism in the TCF7L2 nucleic acid, and is therefore indicative of susceptibility to type II diabetes or is indicative of decreased susceptibility to type II diabetes (or indicative of a protective allele against type II diabetes).

The invention further provides allele-specific oligonucleotides that hybridize to the reference or variant allele of a gene or nucleic acid comprising a single nucleotide polymorphism or to the complement thereof. These oligonucleotides can be probes or primers.

An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer, which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product, which indicates the particular allelic form is present. A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site. The single-base mismatch prevents amplification and no detectable product is formed. The method works best when the mismatch is included in the 3′-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
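The allele-specific priming scheme described above, with the polymorphic base placed at the 3′-most position of the primer, can be modeled in a deliberately simplified way. The sketch below assumes an idealized rule in which a forward primer primes only if it matches the template strand exactly; real amplification depends on reaction conditions. The sequences are hypothetical and are not TCF7L2 sequences, and the function names are illustrative assumptions.

```python
def allele_specific_forward_primers(upstream, alleles, length=20):
    """Build one forward primer per allele, with the allele as the 3'-most base.

    upstream: sequence immediately 5' of the polymorphic site (top strand)
    alleles:  iterable of single-base alleles, e.g. ("C", "T")
    length:   total primer length, including the allele-specific 3' base
    """
    stem = upstream.upper()[-(length - 1):]
    return {allele: stem + allele for allele in alleles}


def primes_amplification(primer, template_top_strand):
    """Idealized rule: the primer primes only on a perfect match.

    A forward primer has the same sequence as the top strand it extends
    along, so a perfect match is a substring match.
    """
    return primer.upper() in template_top_strand.upper()


# Hypothetical template carrying the T allele at the polymorphic site.
UPSTREAM = "GATTACAGATTACAGATTACAGG"
DOWNSTREAM = "ACCTGAAGT"
TEMPLATE_T = UPSTREAM + "T" + DOWNSTREAM
PRIMERS = allele_specific_forward_primers(UPSTREAM, ("C", "T"))
```

With this template, the primer ending in T is predicted to yield a product and the primer ending in C is not, mirroring the matched and mismatched primer pairs described in the text.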

With the addition of such analogs as locked nucleic acids (LNAs), the size of primers and probes can be reduced to as few as 8 bases. LNAs are a novel class of bicyclic DNA analogs in which the 2′ and 4′ positions in the furanose ring are joined via an O-methylene (oxy-LNA), S-methylene (thio-LNA), or amino methylene (amino-LNA) moiety. Common to all of these LNA variants is an affinity toward complementary nucleic acids, which is by far the highest reported for a DNA analog. For example, particular all oxy-LNA nonamers have been shown to have melting temperatures of 64° C. and 74° C. when in complex with complementary DNA or RNA, respectively, as opposed to 28° C. for both DNA and RNA for the corresponding DNA nonamer. Substantial increases in T_m are also obtained when LNA monomers are used in combination with standard DNA or RNA monomers. For primers and probes, depending on where the LNA monomers are included (e.g., the 3′ end, the 5′ end, or in the middle), the T_m could be increased considerably.

In another aspect, arrays of oligonucleotide probes that are complementary to target nucleic acid sequence segments from an individual can be used to identify polymorphisms in a TCF7L2 nucleic acid. For example, in one aspect, an oligonucleotide array can be used. Oligonucleotide arrays typically comprise a plurality of different oligonucleotide probes that are coupled to a surface of a substrate in different known locations. These oligonucleotide arrays, also described as “Genechips™,” have been generally described in the art, for example, U.S. Pat. No. 5,143,854 and PCT patent publication Nos. WO 90/15070 and 92/10092. These arrays can generally be produced using mechanical synthesis methods or light directed synthesis methods that incorporate a combination of photolithographic methods and solid phase oligonucleotide synthesis methods. See Fodor et al., Science 251:767-777 (1991), Pirrung et al., U.S. Pat. No. 5,143,854 (see also PCT Application No. WO 90/15070) and Fodor et al., PCT Publication No. WO 92/10092 and U.S. Pat. No. 5,424,186, the entire teachings of which are incorporated by reference herein. Techniques for the synthesis of these arrays using mechanical synthesis methods are described in, e.g., U.S. Pat. No. 5,384,261, the entire teachings of which are incorporated by reference herein. In another example, linear arrays can be utilized.

Once an oligonucleotide array is prepared, a nucleic acid of interest is hybridized with the array and scanned for polymorphisms. Hybridization and scanning are generally carried out by methods described herein and also in, e.g., published PCT Application Nos. WO 92/10092 and WO 95/11995, and U.S. Pat. No. 5,424,186, the entire teachings of which are incorporated by reference herein. In brief, a target nucleic acid sequence that includes one or more previously identified polymorphic markers is amplified by well-known amplification techniques, e.g., PCR. Typically, this involves the use of primer sequences that are complementary to the two strands of the target sequence both upstream and downstream from the polymorphism. Asymmetric PCR techniques may also be used. Amplified target, generally incorporating a label, is then hybridized with the array under appropriate conditions. Upon completion of hybridization and washing of the array, the array is scanned to determine the position on the array to which the target sequence hybridizes. The hybridization data obtained from the scan are typically in the form of fluorescence intensities as a function of location on the array.
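As a hedged illustration of how fluorescence intensities read from an array might be turned into a genotype call for a single biallelic marker, the sketch below compares the signal from two allele-specific probe locations. The function name, the minimum-signal cutoff, and the heterozygote band are hypothetical values chosen for illustration; they are not taken from the cited array methods, and production genotype calling is considerably more elaborate.

```python
def call_genotype(intensity_a, intensity_b, allele_a, allele_b,
                  min_signal=100.0, het_band=(0.3, 0.7)):
    """Call a genotype from two allele-specific probe intensities.

    intensity_a, intensity_b: fluorescence at the probe locations for each allele
    allele_a, allele_b:       the alleles those probes detect, e.g. "T" and "C"
    min_signal:               hypothetical cutoff below which no call is made
    het_band:                 hypothetical range of the allele_a signal fraction
                              that is treated as heterozygous
    """
    total = intensity_a + intensity_b
    if total < min_signal:
        return None  # too little signal to make a call
    fraction_a = intensity_a / total
    if fraction_a >= het_band[1]:
        return allele_a + allele_a
    if fraction_a <= het_band[0]:
        return allele_b + allele_b
    return allele_a + allele_b
```

For instance, strong signal at only the T probe of rs7903146 would be called as the TT genotype, while comparable signal at both probes would be called as heterozygous.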

Although primarily described in terms of a single detection block, e.g., for detecting a single polymorphism, arrays can include multiple detection blocks, and thus be capable of analyzing multiple, specific polymorphisms. In alternative aspects, it will generally be understood that detection blocks may be grouped within a single array or in multiple, separate arrays so that varying, optimal conditions may be used during the hybridization of the target to the array. For example, it may often be desirable to provide for the detection of those polymorphisms that fall within G-C rich stretches of a genomic sequence, separately from those falling in A-T rich segments. This allows for the separate optimization of hybridization conditions for each situation. Additional uses of oligonucleotide arrays for polymorphism detection can be found, for example, in U.S. Pat. Nos. 5,858,659 and 5,837,832, the entire teachings of which are incorporated by reference herein.

Other methods of nucleic acid analysis can be used to detect polymorphisms in a type II diabetes gene or variants encoded by a type II diabetes gene. Representative methods include direct manual sequencing (Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1991-1995 (1988); Sanger, F. et al., Proc. Natl. Acad. Sci. USA 74:5463-5467 (1977); Beavis et al., U.S. Pat. No. 5,288,644); automated fluorescent sequencing; single-stranded conformation polymorphism assays (SSCP); clamped denaturing gel electrophoresis (CDGE); denaturing gradient gel electrophoresis (DGGE) (Sheffield, V. C. et al., Proc. Natl. Acad. Sci. USA 86:232-236 (1989)); mobility shift analysis (Orita, M. et al., Proc. Natl. Acad. Sci. USA 86:2766-2770 (1989)); restriction enzyme analysis (Flavell et al., Cell 15:25 (1978); Geever, et al., Proc. Natl. Acad. Sci. USA 78:5081 (1981)); heteroduplex analysis; chemical mismatch cleavage (CMC) (Cotton et al., Proc. Natl. Acad. Sci. USA 85:4397-4401 (1985)); RNase protection assays (Myers, R. M. et al., Science 230:1242 (1985)); use of polypeptides which recognize nucleotide mismatches, such as E. coli mutS protein; and allele-specific PCR.

In one aspect of the invention, diagnosis of a susceptibility to type II diabetes, or of a decreased susceptibility to type II diabetes (or indicative of a protective allele against type II diabetes), can also be made by expression analysis by quantitative PCR (kinetic thermal cycling). This technique, utilizing TaqMan® assays, can assess the presence of an alteration in the expression or composition of the polypeptide encoded by a TCF7L2 nucleic acid or splicing variants encoded by a TCF7L2 nucleic acid. TaqMan® probes can also be used to allow the identification of polymorphisms and whether a patient is homozygous or heterozygous. Further, the expression of the variants can be quantified as physically or functionally different.

In another aspect of the invention, diagnosis of a susceptibility to type II diabetes or of a decreased susceptibility to type II diabetes (or indicative of a protective allele against type II diabetes), can be made by examining expression and/or composition of a TCF7L2 polypeptide, by a variety of methods, including enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. A test sample from an individual is assessed for the presence of an alteration in the expression and/or an alteration in composition of the polypeptide encoded by a TCF7L2 nucleic acid, or for the presence of a particular variant encoded by a TCF7L2 nucleic acid. An alteration in expression of a polypeptide encoded by a TCF7L2 nucleic acid can be, for example, an alteration in the quantitative polypeptide expression (i.e., the amount of polypeptide produced); an alteration in the composition of a polypeptide encoded by a TCF7L2 nucleic acid is an alteration in the qualitative polypeptide expression (e.g., expression of an altered TCF7L2 polypeptide or of a different splicing variant). In a preferred aspect, diagnosis of a susceptibility to type II diabetes or of a decreased susceptibility to type II diabetes can be made by detecting a particular splicing variant encoded by that TCF7L2 nucleic acid, or a particular pattern of splicing variants.

Both such alterations (quantitative and qualitative) can also be present. The term “alteration” in the polypeptide expression or composition, as used herein, refers to an alteration in expression or composition in a test sample, as compared with the expression or composition of the polypeptide encoded by a TCF7L2 nucleic acid in a control sample. A control sample is a sample that corresponds to the test sample (e.g., is from the same type of cells), and is from an individual who is not affected by a susceptibility to type II diabetes. An alteration in the expression or composition of the polypeptide in the test sample, as compared with the control sample, is indicative of a susceptibility to type II diabetes. Similarly, the presence of one or more different splicing variants in the test sample, or the presence of significantly different amounts of different splicing variants in the test sample, as compared with the control sample, is indicative of a susceptibility to type II diabetes. Various means of examining expression or composition of the polypeptide encoded by a TCF7L2 nucleic acid can be used, including: spectroscopy, colorimetry, electrophoresis, isoelectric focusing, and immunoassays (e.g., David et al., U.S. Pat. No. 4,376,110) such as immunoblotting (see also Current Protocols in Molecular Biology, particularly Chapter 10). For example, in one aspect, an antibody capable of binding to the polypeptide (e.g., as described above), preferably an antibody with a detectable label, can be used. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin.

Western blotting analysis, using an antibody as described above that specifically binds to a polypeptide encoded by an altered TCF7L2 nucleic acid or an antibody that specifically binds to a polypeptide encoded by a non-altered nucleic acid, or an antibody that specifically binds to a particular splicing variant encoded by a nucleic acid, can be used to identify the presence in a test sample of a particular splicing variant or of a polypeptide encoded by a polymorphic or altered TCF7L2 nucleic acid, or the absence in a test sample of a particular splicing variant or of a polypeptide encoded by a non-polymorphic or non-altered nucleic acid. The presence of a polypeptide encoded by a polymorphic or altered nucleic acid, or the absence of a polypeptide encoded by a non-polymorphic or non-altered nucleic acid, is diagnostic for a susceptibility to type II diabetes, as is the presence (or absence) of particular splicing variants encoded by the TCF7L2 nucleic acid.

In one aspect of this method, the level or amount of polypeptide encoded by a TCF7L2 nucleic acid in a test sample is compared with the level or amount of the polypeptide encoded by the TCF7L2 nucleic acid in a control sample. A level or amount of the polypeptide in the test sample that is higher or lower than the level or amount of the polypeptide in the control sample, such that the difference is statistically significant, is indicative of an alteration in the expression of the polypeptide encoded by the TCF7L2 nucleic acid, and is diagnostic for a susceptibility to type II diabetes. Alternatively, the composition of the polypeptide encoded by a TCF7L2 nucleic acid in a test sample is compared with the composition of the polypeptide encoded by the TCF7L2 nucleic acid in a control sample (e.g., the presence of different splicing variants). A difference in the composition of the polypeptide in the test sample, as compared with the composition of the polypeptide in the control sample, is diagnostic for a susceptibility to type II diabetes. In another aspect, both the level or amount and the composition of the polypeptide can be assessed in the test sample and in the control sample. A difference in the amount or level of the polypeptide in the test sample, compared to the control sample; a difference in composition in the test sample, compared to the control sample; or both a difference in the amount or level, and a difference in the composition, is indicative of a susceptibility to type II diabetes.

The same methods can conversely be used to identify the presence of a difference when compared to a control (disease) sample. A difference from the control is indicative of a decreased susceptibility to diabetes, and/or is indicative of a protective allele against type II diabetes.

Assessment for Markers and Haplotypes

Populations of individuals exhibiting genetic diversity do not have identical genomes. Rather, the genome exhibits sequence variability between individuals at many locations in the genome; in other words, there are many polymorphic sites in a population. In some instances, reference is made to different alleles at a polymorphic site without choosing a reference allele. Alternatively, a reference sequence can be referred to for a particular polymorphic site. The reference allele is sometimes referred to as the “wild-type” allele and it usually is chosen as either the first sequenced allele or as the allele from a “non-affected” individual (e.g., an individual that does not display a disease or abnormal phenotype). Alleles that differ from the reference are referred to as “variant” alleles.

A “marker”, as described herein, refers to a genomic sequence characteristic of a particular variant allele (i.e., polymorphic site). The marker can comprise any allele of any variant type found in the genome, including SNPs, microsatellites, insertions, deletions, duplications and translocations.

SNP nomenclature as reported herein refers to the official Reference SNP (rs) ID identification tag as assigned to each unique SNP by the National Center for Biotechnology Information (NCBI).

A “haplotype,” as described herein, refers to a segment of a genomic DNA strand that is characterized by a specific combination of genetic markers (“alleles”) arranged along the segment. In a certain embodiment, the haplotype can comprise one or more alleles, two or more alleles, three or more alleles, four or more alleles, or five or more alleles. The genetic markers are particular “alleles” at “polymorphic sites” associated with the exon 4 LD block of TCF7L2. As used herein, “exon 4 LD block of TCF7L2” refers to the LD block on Chr10q within which association of variants to type II diabetes is observed. The NCBI Build 34 position of this LD block is from 114,413,084 to 114,488,013 bp. The term “susceptibility”, as described herein, encompasses both increased susceptibility and decreased susceptibility. Thus, particular markers and/or haplotypes of the invention may be characteristic of increased susceptibility to type II diabetes, as characterized by a relative risk of greater than one. Markers and/or haplotypes that confer increased susceptibility to type II diabetes are furthermore considered to be “at-risk”, as they confer an increased risk of disease. Alternatively, the markers and/or haplotypes of the invention are characteristic of decreased susceptibility to type II diabetes, as characterized by a relative risk of less than one.

A nucleotide position at which more than one sequence is possible in a population (either a natural population or a synthetic population, e.g., a library of synthetic molecules) is referred to herein as a “polymorphic site”. Where a polymorphic site is a single nucleotide in length, the site is referred to as a single nucleotide polymorphism (“SNP”). For example, if at a particular chromosomal location, one member of a population has an adenine and another member of the population has a thymine at the same position, then this position is a polymorphic site, and, more specifically, the polymorphic site is a SNP. Alleles for SNP markers as referred to herein refer to the bases A, C, G or T as they occur at the polymorphic site in the SNP assay employed. The person skilled in the art will realize that by assaying or reading the opposite strand, the complementary allele can in each case be measured. Thus, for a polymorphic site containing an A/G polymorphism, the assay employed may measure the percentage or ratio of the two possible bases, i.e., A and G. Alternatively, by designing an assay that determines the opposite strand on the DNA template, the percentage or ratio of the complementary bases T/C can be measured. Quantitatively (for example, in terms of relative risk), identical results would be obtained from measurement of either DNA strand (+ strand or − strand). Polymorphic sites can allow for differences in sequences based on substitutions, insertions or deletions. For example, a polymorphic microsatellite has multiple small repeats of bases (such as CA repeats) at a particular site in which the number of repeats varies in the general population. Each version of the sequence with respect to the polymorphic site is referred to herein as an “allele” of the polymorphic site. Thus, in the previous example, the SNP allows for both an adenine allele and a thymine allele. SNPs and microsatellite markers located within the exon 4 LD block of TCF7L2 found to be associated with type II diabetes are described in Tables 2-7.
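The strand equivalence described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names and the allele counts are hypothetical and do not come from the tables referred to herein.

```python
# Watson-Crick complement of each base.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_allele(base):
    """Return the allele that would be read on the opposite strand."""
    return COMPLEMENT[base.upper()]


def complement_counts(counts):
    """Re-express allele counts measured on one strand in terms of the other strand."""
    return {complement_allele(base): n for base, n in counts.items()}


# Hypothetical counts for an A/G polymorphism read on the + strand.
plus_strand = {"A": 30, "G": 70}
minus_strand = complement_counts(plus_strand)
print(minus_strand)  # the same counts, now labeled T and C
```

Because only the labels change and the counts do not, any ratio or relative risk computed from either strand is the same.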

Typically, a reference sequence is referred to for a particular sequence. Alleles that differ from the reference are referred to as “variant” alleles. For example, the reference genomic DNA sequence between positions 114413084 and 114488013 of NCBI Build 34 (equals 74929 bp, or 74.9 kb), which refers to the location within Chromosome 10, is described herein as SEQ ID NO: 1. A variant sequence, as used herein, refers to a sequence that differs from SEQ ID NO: 1 but is otherwise substantially similar. The genetic markers that make up the haplotypes associated with the exon 4 LD block of TCF7L2 are variants. Additional variants can include changes that affect a polypeptide, e.g., a polypeptide encoded by the TCF7L2 gene. These sequence differences, when compared to a reference nucleotide sequence, can include the insertion or deletion of a single nucleotide, or of more than one nucleotide, which may result in a frame shift; the change of at least one nucleotide, which may result in a change in the encoded amino acid; the change of at least one nucleotide, which may result in the generation of a premature stop codon; the deletion of several nucleotides, which may result in a deletion of one or more amino acids encoded by the nucleotides; the insertion of one or several nucleotides, such as by unequal recombination or gene conversion, which may result in an interruption of the coding sequence of a reading frame; duplication of all or a part of a sequence; transposition; or a rearrangement of a nucleotide sequence, as described in detail herein. Such sequence changes alter the polypeptide encoded by the nucleic acid. For example, if the change in the nucleic acid sequence causes a frame shift, the frame shift can result in a change in the encoded amino acids, and/or can result in the generation of a premature stop codon, causing generation of a truncated polypeptide. Alternatively, a polymorphism associated with type II diabetes or a susceptibility to type II diabetes can be a synonymous change in one or more nucleotides (i.e., a change that does not result in a change in the amino acid sequence). Such a polymorphism can, for example, alter splice sites, affect the stability or transport of mRNA, or otherwise affect the transcription or translation of an encoded polypeptide. It can also alter DNA to increase the possibility that structural changes, such as amplifications or deletions, occur at the somatic level in tumors. The polypeptide encoded by the reference nucleotide sequence is the “reference” polypeptide with a particular reference amino acid sequence, and polypeptides encoded by variant alleles are referred to as “variant” polypeptides with variant amino acid sequences.

A polymorphic microsatellite has multiple small repeats of bases that are 2-8 nucleotides in length (such as CA repeats) at a particular site, in which the number of repeats varies in the general population. An indel is a common form of polymorphism comprising a small insertion or deletion that is typically only a few nucleotides long.

The haplotypes described herein are a combination of various genetic markers, e.g., SNPs and microsatellites, having particular alleles at polymorphic sites. Because the haplotypes comprise a combination of various genetic markers, detecting haplotypes can be accomplished by methods known in the art for detecting sequences at polymorphic sites. For example, standard techniques for genotyping for the presence of SNPs and/or microsatellite markers can be used, such as fluorescence-based techniques (Chen, X. et al., Genome Res. 9(5): 492-98 (1999)), PCR, LCR, Nested PCR and other techniques for nucleic acid amplification. These markers and SNPs can be identified in at-risk haplotypes. Certain methods of identifying relevant markers and SNPs include the use of linkage disequilibrium (LD) and/or LOD scores.

In certain methods described herein, an individual who is at-risk for type II diabetes is an individual in whom an at-risk marker or haplotype is identified. In one aspect, the at-risk marker or haplotype is one that confers a significant increased risk (or susceptibility) of type II diabetes. In one embodiment, significance associated with a marker or haplotype is measured by a relative risk. In a further embodiment, the significance is measured by a percentage. In one embodiment, a significant increased risk is measured as a relative risk of at least about 1.2, including but not limited to: 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8 and 1.9. In a further embodiment, a relative risk of at least 1.2 is significant. In a further embodiment, a relative risk of at least about 1.5 is significant. In a further embodiment, a relative risk of at least about 1.7 is significant. In a further embodiment, a significant increase in risk is at least about 20%, including but not limited to about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% and 98%. In a further embodiment, a significant increase in risk is at least about 50%.

In other embodiments of the invention, the marker or haplotype confers decreased risk (decreased susceptibility) of type II diabetes. In one embodiment, significant decreased risk is measured as a relative risk of less than 0.9, including but not limited to 0.9, 0.8, 0.7, 0.6, 0.5, and 0.4. In a further embodiment, significant relative risk is less than 0.7. In another embodiment, the decrease in risk (or susceptibility) is at least about 20%, including but not limited to about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% and 98%. In a further embodiment, a significant decrease in risk is at least about 30%.

Thus, the term “susceptibility to type II diabetes” indicates either an increased risk or susceptibility or a decreased risk or susceptibility of type II diabetes, by an amount that is significant, when a certain allele, marker, SNP or haplotype is present; significance is measured as indicated above. The terms “decreased risk”, “decreased susceptibility” and “protection against,” as used herein, indicate that the relative risk is decreased accordingly when a certain other allele, marker, SNP, and/or a certain other haplotype, is present. It is understood, however, that identifying whether an increased or decreased risk is medically significant may also depend on a variety of factors, including the specific disease, the marker or haplotype, and often, environmental factors.

An at-risk marker or haplotype in, or comprising portions of, the TCF7L2 gene, is one where the marker or haplotype is more frequently present in an individual at risk for type II diabetes (affected), compared to the frequency of its presence in a healthy individual (control), and wherein the presence of the marker or haplotype is indicative of susceptibility to type II diabetes. An example of a simple test for correlation would be a Fisher exact test on a two-by-two table. Given a cohort of chromosomes, the two-by-two table is constructed out of the number of chromosomes that include both of the markers or haplotypes, one of the markers or haplotypes but not the other, and neither of the markers or haplotypes.
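The two-by-two test mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the results herein: the function name and the chromosome counts are hypothetical, and the two-sided p-value is taken, by one common convention, as the sum of the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(both, first_only, second_only, neither):
    """Two-sided Fisher exact p-value for a 2x2 table of chromosome counts."""
    row1 = both + first_only          # chromosomes carrying the first marker
    row2 = second_only + neither      # chromosomes lacking the first marker
    col1 = both + second_only         # chromosomes carrying the second marker
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x chromosomes with both markers.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(both)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as unlikely as the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: both markers, first only, second only, neither.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

Exact integer binomial coefficients are used so that the sketch needs only the standard library; for very large cohorts a log-scale computation would be preferable.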

In certain aspects of the invention, an at-risk marker or haplotype is an at-risk marker or haplotype within or near TCF7L2 that significantly correlates with type II diabetes. In other aspects, an at-risk marker or haplotype comprises an at-risk marker or haplotype within or near TCF7L2 that significantly correlates with susceptibility to type II diabetes. In particular embodiments of the invention, the marker or haplotype is associated with the exon 4 LD block of TCF7L2, as described herein.

Standard techniques for genotyping for the presence of SNPs and/or microsatellite markers can be used, such as fluorescence-based techniques (Chen, et al., Genome Res. 9, 492 (1999)), PCR, LCR, Nested PCR and other techniques for nucleic acid amplification.

In a preferred aspect, the method comprises assessing in an individual the presence or frequency of SNPs and/or microsatellites in, or comprising portions of, the TCF7L2 gene, wherein an excess or higher frequency of the SNPs and/or microsatellites compared to a healthy control individual is indicative that the individual is susceptible to type II diabetes. Such SNPs and markers can form haplotypes that can be used as screening tools. These markers and SNPs can be identified in at-risk haplotypes. For example, an at-risk haplotype can include microsatellite markers and/or SNPs such as marker DG10S478 and/or SNP rs12255372, rs7895340, rs11196205, rs7901695, rs7903146, rs12243326 or rs4506565. The presence of an at-risk haplotype is indicative of increased susceptibility to type II diabetes, and therefore is indicative of an individual who falls within a target population for the treatment methods described herein.

Identification of Susceptibility Variants

The frequencies of haplotypes in the patient and the control groups can be estimated using an expectation-maximization algorithm (Dempster A. et al., J. R. Stat. Soc. B, 39:1-38 (1977)). An implementation of this algorithm that can handle missing genotypes and uncertainty with the phase can be used. Under the null hypothesis, the patients and the controls are assumed to have identical frequencies. Using a likelihood approach, an alternative hypothesis is tested, where a candidate at-risk haplotype, which can include the markers described herein, is allowed to have a higher frequency in patients than controls, while the ratios of the frequencies of other haplotypes are assumed to be the same in both groups. Likelihoods are maximized separately under both hypotheses and a corresponding 1-df likelihood ratio statistic is used to evaluate the statistical significance.
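As an illustration of how an expectation-maximization algorithm resolves phase uncertainty, the following Python sketch estimates haplotype frequencies for two biallelic markers from unphased genotypes. It is a simplified sketch under stated assumptions: genotypes are coded as counts (0, 1 or 2) of the "1" allele at each marker, missing genotypes are not handled, and the function name and data are hypothetical. Only double heterozygotes are ambiguous, so the E-step splits each of them between the two compatible haplotype pairs.

```python
def em_haplotype_freqs(genotypes, n_iter=100):
    """Estimate frequencies of haplotypes 00, 01, 10, 11 (in that order)."""
    split = {0: (0, 0), 1: (0, 1), 2: (1, 1)}  # genotype -> its two alleles
    freqs = [0.25, 0.25, 0.25, 0.25]
    for _ in range(n_iter):
        counts = [0.0, 0.0, 0.0, 0.0]
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: phase is unknown, either 00+11 or 01+10.
                w_cis = freqs[0] * freqs[3]
                w_trans = freqs[1] * freqs[2]
                total = w_cis + w_trans
                p_cis = w_cis / total if total > 0 else 0.5
                counts[0] += p_cis
                counts[3] += p_cis
                counts[1] += 1.0 - p_cis
                counts[2] += 1.0 - p_cis
            else:
                # At most one heterozygous marker: phase is unambiguous.
                (a1, b1), (a2, b2) = split[g1], split[g2]
                counts[2 * a1 + a2] += 1.0
                counts[2 * b1 + b2] += 1.0
        # M-step: new frequencies are the expected haplotype proportions.
        n_chromosomes = 2.0 * len(genotypes)
        freqs = [c / n_chromosomes for c in counts]
    return freqs


# Hypothetical sample: three 00/00 individuals, one 11/11, one double heterozygote.
sample = [(0, 0), (0, 0), (0, 0), (2, 2), (1, 1)]
print(em_haplotype_freqs(sample))
```

In a case-control setting, the same estimation would be run under the null and alternative hypotheses and the maximized likelihoods compared, as described above; that step is omitted here for brevity.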

To look for at-risk and protective markers and haplotypes within a linkage region, for example, association of all possible combinations of genotyped markers is studied, provided those markers span a practical region. The combined patient and control groups can be randomly divided into two sets, equal in size to the original group of patients and controls. The marker and haplotype analysis is then repeated and the most significant p-value registered is determined. This randomization scheme can be repeated, for example, over 100 times to construct an empirical distribution of p-values. In a preferred embodiment, a p-value of <0.05 is indicative of a significant marker and/or haplotype association.

A detailed discussion of haplotype analysis follows.

Haplotype Analysis

One general approach to haplotype analysis involves using likelihood-based inference applied to NEsted MOdels (Gretarsdottir S., et al., Nat. Genet. 35:131-38 (2003)). The method is implemented in the program NEMO, which allows for many polymorphic markers, SNPs and microsatellites. The method and software are specifically designed for case-control studies where the purpose is to identify haplotype groups that confer different risks. It is also a tool for studying LD structures. In NEMO, maximum likelihood estimates, likelihood ratios and p-values are calculated directly, with the aid of the EM algorithm, for the observed data, treating it as a missing-data problem.

Measuring Information

Even though likelihood ratio tests based on likelihoods computed directly for the observed data, which have captured the information loss due to uncertainty in phase and missing genotypes, can be relied on to give valid p-values, it would still be of interest to know how much information had been lost due to the information being incomplete. The information measure for haplotype analysis is described in Nicolae and Kong (Technical Report 537, Department of Statistics, University of Chicago; Biometrics, 60(2):368-75 (2004)) as a natural extension of information measures defined for linkage analysis, and is implemented in NEMO.

Statistical Analysis

For single marker association to the disease, the Fisher exact test can be used to calculate two-sided p-values for each individual allele. All p-values are presented unadjusted for multiple comparisons unless specifically indicated. The presented frequencies (for microsatellites, SNPs and haplotypes) are allelic frequencies as opposed to carrier frequencies. To minimize any bias due to the relatedness of the patients who were recruited as families for the linkage analysis, first and second-degree relatives can be eliminated from the patient list. Furthermore, the test can be repeated for association correcting for any remaining relatedness among the patients, by extending a variance adjustment procedure described in Risch, N. & Teng, J. (Genome Res., 8:1273-1288 (1998)), DNA pooling (ibid) for sibships so that it can be applied to general familial relationships, and both adjusted and unadjusted p-values can be presented for comparison. The differences are in general very small, as expected. To assess the significance of single-marker association corrected for multiple testing, a randomization test can be carried out using the same genotype data. Cohorts of patients and controls can be randomized and the association analysis redone multiple times (e.g., up to 500,000 times); the adjusted p-value is the fraction of replications that produced a p-value for some marker allele that is lower than or equal to the p-value observed using the original patient and control cohorts.
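The randomization test for multiple-testing correction can be sketched as follows in Python. The sketch makes one substitution that should be noted plainly: for brevity it ranks markers by a 2x2 chi-square statistic rather than by the Fisher exact p-value, so "a p-value lower than or equal to the observed one" becomes "a statistic greater than or equal to the observed one". The function names, the 0/1 allele coding and the data are hypothetical.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df) for a 2x2 table of counts."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def best_statistic(alleles, is_case):
    """Largest single-marker statistic over all markers."""
    best = 0.0
    for m in range(len(alleles[0])):
        a = b = c = d = 0
        for row, case in zip(alleles, is_case):
            if case and row[m]:
                a += 1      # patient chromosome carrying the allele
            elif case:
                b += 1      # patient chromosome without the allele
            elif row[m]:
                c += 1      # control chromosome carrying the allele
            else:
                d += 1      # control chromosome without the allele
        best = max(best, chi_square_2x2(a, b, c, d))
    return best


def adjusted_p_value(alleles, is_case, n_rep=1000, seed=0):
    """Fraction of label shuffles whose best marker matches or beats the observed one."""
    rng = random.Random(seed)
    observed = best_statistic(alleles, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_rep):
        rng.shuffle(labels)  # randomize patient/control status
        if best_statistic(alleles, labels) >= observed:
            hits += 1
    return hits / n_rep


# Hypothetical data: 20 patient and 20 control chromosomes, two markers.
alleles = [[1, i % 2] for i in range(20)] + [[0, i % 2] for i in range(20)]
is_case = [True] * 20 + [False] * 20
print(adjusted_p_value(alleles, is_case, n_rep=200))
```

Shuffling the patient and control labels keeps the correlation between markers intact, which is why the resulting p-value accounts for testing many correlated markers at once.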

For both single-marker and haplotype analyses, relative risk (RR) and the population attributable risk (PAR) can be calculated assuming a multiplicative model (haplotype relative risk model) (Terwilliger, J. D. & Ott, J., Hum. Hered. 42:337-46 (1992) and Falk, C. T. & Rubinstein, P, Ann. Hum. Genet. 51 (Pt 3):227-33 (1987)), i.e., that the risks of the two alleles/haplotypes a person carries multiply. For example, if RR is the risk of A relative to a, then the risk of a person homozygous AA will be RR times that of a heterozygote Aa and RR² times that of a homozygote aa. The multiplicative model has a nice property that simplifies analysis and computations: haplotypes are independent, i.e., in Hardy-Weinberg equilibrium, within the affected population as well as within the control population. As a consequence, haplotype counts of the affecteds and controls each have multinomial distributions, but with different haplotype frequencies under the alternative hypothesis. Specifically, for two haplotypes, h_(i) and h_(j), risk(h_(i))/risk(h_(j))=(f_(i)/p_(i))/(f_(j)/p_(j)), where f and p denote, respectively, frequencies in the affected population and in the control population. While there is some power loss if the true model is not multiplicative, the loss tends to be mild except for extreme cases. Most importantly, p-values are always valid since they are computed with respect to the null hypothesis.
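The relative risk formula above is simple enough to check numerically. The following Python sketch applies it to a biallelic marker; the function names and the allele frequencies are hypothetical and are chosen only to illustrate the arithmetic.

```python
def relative_risk(f_i, p_i, f_j, p_j):
    """risk(h_i)/risk(h_j) = (f_i/p_i)/(f_j/p_j) under the multiplicative model.

    f_* are frequencies in the affected population, p_* in the controls.
    """
    return (f_i / p_i) / (f_j / p_j)


def genotype_risks(rr):
    """Risks of genotypes aa, Aa and AA relative to aa when allele risks multiply."""
    return {"aa": 1.0, "Aa": rr, "AA": rr ** 2}


# Hypothetical frequencies of allele A: 36% in patients, 28% in controls.
f_A, p_A = 0.36, 0.28
rr = relative_risk(f_A, p_A, 1.0 - f_A, 1.0 - p_A)
print(round(rr, 3))
print(genotype_risks(rr))
```

For a biallelic marker the expression reduces to the allelic odds ratio, which is why the estimate depends only on the two allele frequencies.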

Linkage Disequilibrium Using NEMO

LD between pairs of markers can be calculated using the standard definition of D′ and R² (Lewontin, R., Genetics 49:49-67 (1964); Hill, W. G. & Robertson, A. Theor. Appl. Genet. 22:226-231 (1968)). Using NEMO, frequencies of the two marker allele combinations are estimated by maximum likelihood and deviation from linkage equilibrium is evaluated by a likelihood ratio test. The definitions of D′ and R² are extended to include microsatellites by averaging over the values for all possible allele combinations of the two markers weighted by the marginal allele probabilities. When plotting all marker combinations to elucidate the LD structure in a particular region, D′ is plotted in the upper left corner and the p-value in the lower right corner. In the LD plots the markers can be plotted equidistant rather than according to their physical location, if desired.
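For two biallelic markers, the standard definitions of D′ and R² can be computed directly from the haplotype and allele frequencies. The Python sketch below shows this calculation; it does not reproduce NEMO or its extension to microsatellites, the function name and frequencies are hypothetical, and D′ is reported as an absolute value.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for alleles A and B at two biallelic markers.

    p_ab is the frequency of the haplotype carrying both A and B;
    p_a and p_b are the marginal allele frequencies.
    """
    d = p_ab - p_a * p_b  # deviation from linkage equilibrium
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    variance = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / variance if variance > 0 else 0.0
    return d_prime, r_squared


# Hypothetical frequencies: alleles A and B always travel together.
print(ld_measures(0.3, 0.3, 0.3))
# Hypothetical frequencies: the markers are independent.
print(ld_measures(0.25, 0.5, 0.5))
```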

Statistical Methods for Linkage Analysis

Multipoint, affected-only allele-sharing methods can be used in the analyses to assess evidence for linkage. Results, both the LOD-score and the non-parametric linkage (NPL) score, can be obtained using the program Allegro (Gudbjartsson et al., Nat. Genet. 25:12-3 (2000)). Our baseline linkage analysis uses the Spairs scoring function (Whittemore, A. S., Halpern, J. Biometrics 50:118-27 (1994); Kruglyak L. et al., Am. J. Hum. Genet. 58:1347-63 (1996)), the exponential allele-sharing model (Kong, A. and Cox, N. J., Am. J. Hum. Genet. 61:1179-88 (1997)) and a family weighting scheme that is halfway, on the log-scale, between weighting each affected pair equally and weighting each family equally. The information measure that we use is part of the Allegro program output and the information value equals zero if the marker genotypes are completely uninformative and equals one if the genotypes determine the exact amount of allele sharing by descent among the affected relatives (Gretarsdottir et al., Am. J. Hum. Genet., 70:593-603 (2002)). The P-values were computed in two different ways and the less significant result is reported here. The first P-value can be computed on the basis of large sample theory; the distribution of Z_(lr)=√(2[log_(e)(10)LOD]) approximates a standard normal variable under the null hypothesis of no linkage (Kong, A. and Cox, N. J., Am. J. Hum. Genet. 61:1179-88 (1997)). The second P-value can be calculated by comparing the observed LOD-score with its complete data sampling distribution under the null hypothesis (e.g., Gudbjartsson et al., Nat. Genet. 25:12-3 (2000)). When the data consist of more than a few families, these two P-values tend to be very similar.
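The large-sample P-value described above follows from treating the square root of 2·ln(10)·LOD as a standard normal variable. A minimal Python sketch of that conversion is given below; the function names are hypothetical, the P-value is taken as one-sided, and a negative LOD is treated as zero, which is an assumption of this sketch rather than something stated herein.

```python
from math import erfc, log, sqrt


def lod_to_z(lod):
    """Z_lr = sqrt(2 * ln(10) * LOD); negative LOD scores are clipped to zero."""
    return sqrt(2.0 * log(10.0) * max(lod, 0.0))


def lod_to_p_value(lod):
    """One-sided upper-tail probability of a standard normal at Z_lr."""
    return 0.5 * erfc(lod_to_z(lod) / sqrt(2.0))


# A LOD score of 3 corresponds to a Z of about 3.72.
print(lod_to_z(3.0), lod_to_p_value(3.0))
```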

Haplotypes and “Haplotype Block” Definition of a Susceptibility Locus

In certain embodiments, marker and haplotype analysis involves defining a candidate susceptibility locus based on “haplotype blocks” (also called “LD blocks”). It has been reported that portions of the human genome can be broken into series of discrete haplotype blocks containing a few common haplotypes; for these blocks, linkage disequilibrium data provided little evidence indicating recombination (see, e.g., Wall, J. D. and Pritchard, J. K., Nature Reviews Genetics 4:587-597 (2003); Daly, M. et al., Nature Genet. 29:229-232 (2001); Gabriel, S. B. et al., Science 296:2225-2229 (2002); Patil, N. et al., Science 294:1719-1723 (2001); Dawson, E. et al., Nature 418:544-548 (2002); Phillips, M. S. et al., Nature Genet. 33:382-387 (2003)).

There are two main methods for defining these haplotype blocks: blocks can be defined as regions of DNA that have limited haplotype diversity (see, e.g., Daly, M. et al., Nature Genet. 29:229-232 (2001); Patil, N. et al., Science 294:1719-1723 (2001); Dawson, E. et al., Nature 418:544-548 (2002); Zhang, K. et al., Proc. Natl. Acad. Sci. USA 99:7335-7339 (2002)), or as regions between transition zones having extensive historical recombination, identified using linkage disequilibrium (see, e.g., Gabriel, S. B. et al., Science 296:2225-2229 (2002); Phillips, M. S. et al., Nature Genet. 33:382-387 (2003); Wang, N. et al., Am. J. Hum. Genet. 71:1227-1234 (2002); Stumpf, M. P., and Goldstein, D. B., Curr. Biol. 13:1-8 (2003)). As used herein, the terms “haplotype block” and “LD block” include blocks defined by either characteristic.

Representative methods for identification of haplotype blocks are set forth, for example, in U.S. Published Patent Application Nos. 20030099964, 20030170665, 20040023237 and 20040146870. Haplotype blocks can be used readily to map associations between phenotype and haplotype status. The main haplotypes can be identified in each haplotype block, and then a set of “tagging” SNPs or markers (the smallest set of SNPs or markers needed to distinguish among the haplotypes) can be identified. These tagging SNPs or markers can then be used in assessment of samples from groups of individuals, in order to identify association between phenotype and haplotype. If desired, neighboring haplotype blocks can be assessed concurrently, as there may also exist linkage disequilibrium among the haplotype blocks.
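As a minimal sketch of the tagging idea described above, the smallest set of marker positions that distinguishes among a set of haplotypes can be found by exhaustive search. The haplotype strings below are hypothetical, the function name is an assumption, and brute force is used only for clarity; it is not the method of the cited applications and becomes impractical for blocks with many markers.

```python
from itertools import combinations


def tagging_snps(haplotypes):
    """Return the smallest tuple of marker positions whose alleles
    distinguish every distinct haplotype from every other one.

    Each haplotype is a string with one allele character per marker.
    """
    unique = sorted(set(haplotypes))
    if not unique:
        raise ValueError("at least one haplotype is required")
    n_markers = len(unique[0])
    if any(len(h) != n_markers for h in unique):
        raise ValueError("all haplotypes must cover the same markers")

    # Try subsets in order of increasing size, so the first hit is minimal.
    for size in range(n_markers + 1):
        for positions in combinations(range(n_markers), size):
            signatures = {tuple(h[i] for i in positions) for h in unique}
            if len(signatures) == len(unique):
                return positions
    return None  # unreachable: the full marker set separates distinct haplotypes


if __name__ == "__main__":
    # Three hypothetical haplotypes over three markers.
    print(tagging_snps(["AAG", "ATG", "CAC"]))
```

In this hypothetical example no single marker separates all three haplotypes, so two markers are needed.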

Haplotypes and Diagnostics

As described herein, certain markers and haplotypes comprising such markers are found to be useful for determination of susceptibility to type II diabetes, i.e., they are found to be useful for diagnosing a susceptibility to type II diabetes. Particular markers and haplotypes are found more frequently in individuals with type II diabetes than in individuals without type II diabetes. Therefore, these markers and haplotypes have predictive value for detecting type II diabetes, or a susceptibility to type II diabetes, in an individual. Haplotype blocks (i.e., the exon 4 LD block of TCF7L2) comprising certain tagging markers can be found more frequently in individuals with type II diabetes than in individuals without type II diabetes. Therefore, these “at-risk” tagging markers within the haplotype block also have predictive value for detecting type II diabetes, or a susceptibility to type II diabetes, in an individual. “At-risk” tagging markers within the haplotype or LD blocks can also include other markers that distinguish among the haplotypes, as these similarly have predictive value for detecting type II diabetes or a susceptibility to type II diabetes. As a consequence of the haplotype block structure of the human genome, a large number of markers or other variants and/or haplotypes comprising such markers or variants in association with the haplotype block (LD block) may be found to be associated with a certain trait and/or phenotype. Thus, it is possible that markers and/or haplotypes residing within the exon 4 LD block of TCF7L2 as defined herein, or in strong LD (characterized by r² greater than 0.2) with the exon 4 LD block of TCF7L2, are associated with type II diabetes (i.e., they confer increased or decreased susceptibility to type II diabetes). This includes markers that are described herein (Table 6), but may also include other markers that are in strong LD (characterized by r² greater than 0.2) with one or more of the markers listed in Table 6. The identification of such additional variants can be achieved by methods well known to those skilled in the art, for example by DNA sequencing of the LD block A genomic region in a particular group of individuals, and the present invention also encompasses such additional variants.
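For illustration, the r² measure of linkage disequilibrium referred to above can be computed for two biallelic markers from phased haplotype counts using the standard definition r² = D² / (p_A(1 − p_A) p_B(1 − p_B)), where D = p_AB − p_A p_B. The counts and function names in this sketch are hypothetical; in practice haplotype frequencies are often estimated from unphased genotypes, which is not shown here.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """r^2 between two biallelic markers from phased haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are the counts of the four two-marker haplotypes.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes observed")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at the first marker
    p_B = (n_AB + n_aB) / total   # frequency of allele B at the second marker
    d = p_AB - p_A * p_B          # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a marker is monomorphic")
    return d * d / denom


def in_strong_ld(r2: float, threshold: float = 0.2) -> bool:
    """Apply the r^2 > 0.2 criterion used in the text above."""
    return r2 > threshold


if __name__ == "__main__":
    r2 = ld_r_squared(50, 10, 10, 30)  # hypothetical counts
    print(f"r^2 = {r2:.3f}, strong LD: {in_strong_ld(r2)}")
```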

As described herein, certain markers within the exon 4 LD block of TCF7L2 are found in decreased frequency in individuals with type II diabetes, and haplotypes comprising two or more of those markers listed in Tables 13, 20 and 21 are also found to be present at decreased frequency in individuals with type II diabetes. These markers and haplotypes are thus protective for type II diabetes, i.e., they confer a decreased risk of individuals carrying these markers and/or haplotypes developing type II diabetes.

The haplotypes and markers described herein are, in some cases, a combination of various genetic markers, e.g., SNPs and microsatellites. Therefore, detecting haplotypes can be accomplished by methods known in the art and/or described herein for detecting sequences at polymorphic sites. Furthermore, correlation between certain haplotypes or sets of markers and disease phenotype can be verified using standard techniques. A representative example of a simple test for correlation would be a Fisher exact test on a two-by-two table.
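A minimal sketch of such a Fisher exact test on a two-by-two table is given below, using only the Python standard library. The table layout (rows as affected/control, columns as carrier/non-carrier), the counts and the function name are assumptions made for illustration; the two-sided P-value is taken as the sum of the probabilities of all tables, with the same margins, that are no more probable than the observed table.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so that ties with the observed table count.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: carriers vs. non-carriers among affecteds and controls.
    print(f"P = {fisher_exact_two_sided(1, 9, 11, 3):.6f}")
```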

In specific embodiments, a marker or haplotype associated with the exon 4 LD block of TCF7L2 is one in which the marker or haplotype is more frequently present in an individual at risk for type II diabetes (affected), compared to the frequency of its presence in a healthy individual (control), wherein the presence of the marker or haplotype is indicative of type II diabetes or a susceptibility to type II diabetes. In other embodiments, at-risk tagging markers in linkage disequilibrium with one or more markers associated with the exon 4 LD block of TCF7L2 are tagging markers that are more frequently present in an individual at risk for type II diabetes (affected), compared to the frequency of their presence in a healthy individual (control), wherein the presence of the tagging markers is indicative of increased susceptibility to type II diabetes. In a further embodiment, at-risk markers in linkage disequilibrium with one or more markers associated with the exon 4 LD block of TCF7L2 are markers that are more frequently present in an individual at risk for type II diabetes, compared to the frequency of their presence in a healthy individual (control), wherein the presence of the markers is indicative of susceptibility to type II diabetes.

In certain methods described herein, an individual who is at risk for type II diabetes is an individual in whom an at-risk marker or haplotype is identified. In one embodiment, the strength of the association of a marker or haplotype is measured by relative risk (RR). RR is the ratio of the incidence of the condition among subjects who carry one copy of the marker or haplotype to the incidence of the condition among subjects who do not carry the marker or haplotype. This ratio is equivalent to the ratio of the incidence of the condition among subjects who carry two copies of the marker or haplotype to the incidence of the condition among subjects who carry one copy of the marker or haplotype. In one embodiment, the marker or haplotype has a relative risk of at least 1.2. In other embodiments, the marker or haplotype has a relative risk of at least 1.3, at least 1.4, at least 1.5, at least 2.0, at least 2.5, at least 3.0, at least 3.5, at least 4.0, or at least 5.0.
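The relative risk defined above can be illustrated with a short sketch. The incidence counts and function names are hypothetical. Because the text treats the two-copy to one-copy ratio as equal to the one-copy to zero-copy ratio, the sketch assumes a multiplicative model in which the risks for zero, one and two copies are in the proportions 1 : RR : RR².

```python
def relative_risk(cases_one_copy: int, n_one_copy: int,
                  cases_no_copy: int, n_no_copy: int) -> float:
    """RR = incidence among one-copy carriers / incidence among non-carriers."""
    incidence_carriers = cases_one_copy / n_one_copy
    incidence_noncarriers = cases_no_copy / n_no_copy
    return incidence_carriers / incidence_noncarriers


def genotype_risks(rr: float) -> tuple:
    """Risks for 0, 1 and 2 copies relative to non-carriers,
    assuming the multiplicative model (1, RR, RR^2)."""
    return (1.0, rr, rr * rr)


if __name__ == "__main__":
    rr = relative_risk(30, 100, 20, 100)  # hypothetical incidence counts
    print(f"RR = {rr:.2f}, genotype risks = {genotype_risks(rr)}")
```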

In other methods of the invention, an individual who has a decreased risk (or decreased susceptibility) of type II diabetes is an individual in whom a protective marker or haplotype is identified. In such cases, the relative risk (RR) is less than unity. In one embodiment, the marker or haplotype has a relative risk of less than 0.9. In other embodiments, the marker or haplotype has a relative risk of less than 0.8, less than 0.7, less than 0.6, less than 0.5 or less than 0.4.

Utility of Genetic Testing

Knowledge of a genetic variant that confers a risk of developing type II diabetes offers the opportunity to apply a genetic test to distinguish between individuals with increased risk of developing the disease (i.e., carriers of the at-risk variant) and those with decreased risk of developing the disease (i.e., carriers of the protective variant). The core value of genetic testing, for individuals belonging to both of the above-mentioned groups, is the possibility of diagnosing the disease at an early stage and of providing information to the clinician about the prognosis/aggressiveness of the disease, so that the most appropriate treatment can be applied. For example, the application of a genetic test for type II diabetes can provide an opportunity for the detection of the disease at an earlier stage, which may lead to the application of therapeutic measures at an earlier stage, and thus can minimize the deleterious effects of the symptoms and serious health consequences conferred by type II diabetes.

Methods of Therapy

In another embodiment of the invention, methods can be employed for the treatment of type II diabetes. The term “treatment,” as used herein, refers not only to ameliorating symptoms associated with type II diabetes, but also preventing or delaying the onset of type II diabetes; lessening the severity or frequency of symptoms of type II diabetes; and/or also lessening the need for concomitant therapy with other drugs that ameliorate symptoms associated with type II diabetes. In one aspect, the individual to be treated is an individual who is susceptible to (at an increased risk for) type II diabetes (e.g., an individual having the presence of an allele other than a 0 allele in marker DG10S478; the presence of a T allele in SNP rs12255372; the presence of an A allele in SNP rs7895340; the presence of a C allele in SNP rs11196205; the presence of a C allele in SNP rs7901695; the presence of a T allele in SNP rs7903146; the presence of a C allele in SNP rs12243326; or the presence of a T allele in SNP rs4506565).
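As a sketch only, the at-risk alleles listed above can be held in a lookup table and checked against an individual's genotype calls. The marker identifiers and alleles are taken from the list above; the data layout (a mapping from marker name to a pair of allele strings) and the function name are assumptions made for illustration.

```python
# At-risk SNP alleles as listed in the text above.
AT_RISK_SNP_ALLELES = {
    "rs12255372": "T",
    "rs7895340": "A",
    "rs11196205": "C",
    "rs7901695": "C",
    "rs7903146": "T",
    "rs12243326": "C",
    "rs4506565": "T",
}
# For the microsatellite, any allele other than "0" is treated as at-risk.
MICROSATELLITE = "DG10S478"


def at_risk_markers(genotypes):
    """Return the sorted names of markers at which an at-risk allele is present.

    genotypes maps a marker name to its two allele calls (hypothetical layout),
    e.g. {"rs7903146": ("C", "T"), "DG10S478": ("0", "8")}.
    """
    hits = []
    for marker, alleles in genotypes.items():
        if marker == MICROSATELLITE:
            if any(str(allele) != "0" for allele in alleles):
                hits.append(marker)
        elif marker in AT_RISK_SNP_ALLELES:
            if AT_RISK_SNP_ALLELES[marker] in alleles:
                hits.append(marker)
    return sorted(hits)


if __name__ == "__main__":
    example = {
        "rs7903146": ("C", "T"),
        "rs12255372": ("G", "G"),
        "DG10S478": ("0", "0"),
    }
    print(at_risk_markers(example))
```

Markers not in the list are ignored, and strand orientation of the allele calls is assumed to match the orientation used in the text.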

In additional embodiments of the invention, methods can be employed for the treatment of other diseases or conditions associated with TCF7L2. A TCF7L2 therapeutic agent can be used both in methods of treatment of type II diabetes, as well as in methods of treatment of other diseases or conditions associated with TCF7L2.

The methods of treatment (prophylactic and/or therapeutic) utilize a TCF7L2 therapeutic agent. A “TCF7L2 therapeutic agent” is an agent that alters (e.g., enhances or inhibits) polypeptide activity and/or nucleic acid expression of TCF7L2, either directly or indirectly (e.g., through altering activity or nucleic acid expression of a protein that interacts with TCF7L2, such as a protein in the Wnt signaling pathway or in the cadherin pathway (e.g., beta-catenin)). In certain embodiments, the TCF7L2 therapeutic agent alters activity and/or nucleic acid expression of TCF7L2.

TCF7L2 therapeutic agents can alter TCF7L2 polypeptide activity or nucleic acid expression by a variety of means, such as, for example, by providing additional TCF7L2 polypeptide or by upregulating the transcription or translation of the TCF7L2 nucleic acid; by altering posttranslational processing of the TCF7L2 polypeptide; by altering transcription of TCF7L2 splicing variants; by interfering with TCF7L2 polypeptide activity (e.g., by binding to a TCF7L2 polypeptide); by binding to another polypeptide that interacts with TCF7L2; by altering (e.g., downregulating) the expression, transcription or translation of a TCF7L2 nucleic acid; or by altering (e.g., agonizing or antagonizing) activity.

Representative TCF7L2 therapeutic agents include the following: nucleic acids or fragments or derivatives thereof described herein, particularly nucleotides encoding the polypeptides described herein and vectors comprising such nucleic acids (e.g., a gene, cDNA, and/or mRNA, such as a nucleic acid encoding a TCF7L2 polypeptide or active fragment or derivative thereof, or an oligonucleotide; or a complement thereof, or fragments or derivatives thereof, and/or other splicing variants encoded by a Type II diabetes nucleic acid, or fragments or derivatives thereof); polypeptides described herein and/or splicing variants encoded by the TCF7L2 nucleic acid or fragments or derivatives thereof; other polypeptides (e.g., TCF7L2 receptors); TCF7L2 binding agents; agents that affect (e.g., increase or decrease) activity; antibodies, such as an antibody to an altered TCF7L2 polypeptide, or an antibody to a non-altered TCF7L2 polypeptide, or an antibody to a particular splicing variant encoded by a TCF7L2 nucleic acid as described above; peptidomimetics; fusion proteins or prodrugs thereof; ribozymes; other small molecules; and other agents that alter (e.g., enhance or inhibit) expression of a TCF7L2 nucleic acid, or that regulate transcription of TCF7L2 splicing variants (e.g., agents that affect which splicing variants are expressed, or that affect the amount of each splicing variant that is expressed). Additional representative TCF7L2 therapeutic agents include compounds that influence insulin signaling and/or glucagon, GLP-1 or GIP signaling. More than one TCF7L2 therapeutic agent can be used concurrently, if desired.

In preferred embodiments, the TCF7L2 therapeutic agent is an agent that interferes with the activity of TCF7L2, such as, for example, an agent that interferes with TCF7L2 binding or interaction of TCF7L2 with beta-catenin (see, e.g., Fasolini, et al., J. Biol. Chem. 278(23):21092-06 (2003)) or with other proteins. Other TCF7L2 therapeutic agents include agents that affect the Wnt signaling pathway or agents that affect the cadherin pathway. Representative agents include agents such as those used for cancer therapy, including, for example, proteins such as the DKK proteins; the beta-catenin binding domain of APC, or Axin; factors such as IDAX, AXAM and ICAT; antisense oligonucleotides or RNA interference (RNAi), such as with the use of Vitravene; oncolytic viral vectors; and other compounds (see, e.g., Luu et al., Current Cancer Drug Targets 4:653-671 (2004)); small molecule antagonists, including, for example, ZTM00990, PKF118-310, PKF118-744, PKF115-584, PKF222-815, CGPO49090, NPDDG39.024, and NPDDG1.024 as described by Lepourcelet et al. (see, e.g., Lepourcelet et al., Cancer Cell 5:91-102 (2004)); compounds described in U.S. Pat. No. 6,762,185; and compounds described in US Patent applications 20040005313, 20040072831, 20040247593, or 20050059628. Other representative TCF7L2 therapeutic agents include GSK3 inhibitors, including, for example, those described in U.S. Pat. Nos. 6,057,117; 6,153,618; 6,417,185; 6,465,231; 6,489,344; 6,512,102; 6,608,063; 6,716,624; 6,800,632; and published US Patent applications 20030008866; 20030077798; 20030130289; 20030207883; 2000092535; and 200500851. The entire teachings of all of the references, patents and patent applications recited in the Specification are incorporated herein in their entirety.

Additional representative TCF7L2 therapeutic agents are shown in the Agent Table, below.

Agent Table

Compound name(s) | Compound name (generated using Autonom, ISIS Draw version 2.5 from MDL Information Systems) | Company | Reference | Indications
--- | --- | --- | --- | ---
AR-0133418 (SN-4521) | 1-(4-Methoxy-benzyl)-3-(5-nitro-thiazol-2-yl)-urea | AstraZeneca | | AD
AR-025028 | NSD | AstraZeneca | |
CT-98023 | N-[4-(2,4-Dichloro-phenyl)-5-(1H-imidazol-2-yl)-pyrimidin-2-yl]-N′-(5-nitro-pyridin-2-yl)-ethane-1,2-diamine | Chiron Corp | | non-insulin dependent diabetes
CT-20026 | NSD | Chiron Corp | Wagman et al., Curr Pharm. Des 2004: 10(10) 1105-37 | non-insulin dependent diabetes
CT-21022 | NSD | Chiron Corp | | non-insulin dependent diabetes
CT-20014 | NSD | Chiron Corp | | non-insulin dependent diabetes
CT-21018 | NSD | Chiron Corp | | non-insulin dependent diabetes
CHIR-98025 | NSD | Chiron Corp | | non-insulin dependent diabetes
CHIR-99021 | NSD | Chiron Corp | Wagman et al., Curr Pharm. Des 2004: 10(10) 1105-37 | non-insulin dependent diabetes
CG-100179 | NSD | CrystalGenomics and Yuyu (Korea) | WO-2004065370 | diabetes mellitus
 | 4-[2-(4-Dimethylamino-3-nitro-phenylamino)-pyrimidin-4-yl]-3,5-dimethyl-1H-pyrrole-2-carbonitrile | Cyclacel Ltd. | | non-insulin dependent diabetes, among others
NP-01139, NP-031112, NP-03112, NP-00361 | 4-Benzyl-2-methyl-[1,2,4]thiadiazolidine-3,5-dione | Neuropharma SA | | CNS disorders, AD
 | 3-[9-Fluoro-2-(piperidine-1-carbonyl)-1,2,3,4-tetrahydro-[1,4]diazepino[6,7,1-hi]indol-7-yl]-4-imidazo[1,2-a]pyridin-3-yl-pyrrole-2,5-dione | Eli Lilly & Co | | non-insulin dependent diabetes
GW-784752x, GW-784775, SB-216763, SB-415286 | Cyclopentanecarboxylic acid (6-pyridin-3-yl-furo[2,3-d]pyrimidin-4-yl)-amide | GSK | WO-03024447 (compound referenced: 4-[2-(2-bromophenyl)-4-(4-fluorophenyl)-1H-imidazol-5-yl]pyridine) | non-insulin dependent diabetes, neurodegenerative disease
NNC-57-0511, NNC-57-0545, NNC-57-0588 | 1-(4-Amino-furazan-3-yl)-5-piperidin-1-ylmethyl-1H-[1,2,3]triazole-4-carboxylic acid [1-pyridin-4-yl-meth-(E)-ylidene]-hydrazide | Novo Nordisk | | non-insulin dependent diabetes
CP-70949 | NSD | Pfizer | | Hypoglycemic agent
VX-608 | NSD | | | Cerebrovascular ischemia, non-insulin dependent diabetes
KP-403 class | NSD | Kinetek | | Nuclear factor kappa B modulator, Anti-inflammatory, Cell cycle inhibitor, Glycogen synthase kinase-3 beta inhibitor
BYETTA (exenatide) | Exenatide: C₁₈₄H₂₈₂N₅₀O₆₀S - Amino acid sequence: H-His-Gly-Glu-Gly-Thr-Phe-Thr-Ser-Asp-Leu-Ser-Lys-Gln-Met-Glu-Glu-Glu-Ala-Val-Arg-Leu-Phe-Ile-Glu-Trp-Leu-Lys-Asn-Gly-Gly-Pro-Ser-Ser-Gly-Ala-Pro-Pro-Pro-Ser-NH₂ | Amylin/Eli Lilly & Co | | non-insulin dependent diabetes
Vildagliptin (LAF237) | NSD | Novartis | | non-insulin dependent diabetes - DPP-4 inhibitor

NSD = No Structure disclosed (in Iddb3)

The TCF7L2 therapeutic agent(s) are administered in a therapeutically effective amount (i.e., an amount that is sufficient for “treatment,” as described above). The amount which will be therapeutically effective in the treatment of a particular individual's disorder or condition will depend on the symptoms and severity of the disease, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration, and the seriousness of the disease or disorder, and should be decided according to the judgment of a practitioner and each patient's circumstances. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.

In one embodiment, a nucleic acid (e.g., a nucleic acid encoding a TCF7L2 polypeptide), or another nucleic acid that encodes a TCF7L2 polypeptide or a splicing variant, derivative or fragment thereof, can be used, either alone or in a pharmaceutical composition as described above. For example, a TCF7L2 gene or nucleic acid or a cDNA encoding a TCF7L2 polypeptide, either by itself or included within a vector, can be introduced into cells (either in vitro or in vivo) such that the cells produce native TCF7L2 polypeptide. If necessary, cells that have been transformed with the gene or cDNA or a vector comprising the gene, nucleic acid or cDNA can be introduced (or re-introduced) into an individual affected with the disease. Thus, cells which, in nature, lack native TCF7L2 expression and activity, or have altered TCF7L2 expression and activity, or have expression of a disease-associated TCF7L2 splicing variant, can be engineered to express the TCF7L2 polypeptide or an active fragment of the TCF7L2 polypeptide (or a different variant of the TCF7L2 polypeptide). In certain embodiments, nucleic acids encoding a TCF7L2 polypeptide, or an active fragment or derivative thereof, can be introduced into an expression vector, such as a viral vector, and the vector can be introduced into appropriate cells in an animal. Other gene transfer systems, including viral and nonviral transfer systems, can be used. Alternatively, nonviral gene transfer methods, such as calcium phosphate coprecipitation; mechanical techniques (e.g., microinjection); membrane fusion-mediated transfer via liposomes; or direct DNA uptake, can also be used.

Alternatively, in another embodiment of the invention, a nucleic acid of the invention; a nucleic acid complementary to a nucleic acid of the invention; or a portion of such a nucleic acid (e.g., an oligonucleotide as described below), can be used in “antisense” therapy, in which a nucleic acid (e.g., an oligonucleotide) which specifically hybridizes to the mRNA and/or genomic DNA of a Type II diabetes gene is administered or generated in situ. The antisense nucleic acid that specifically hybridizes to the mRNA and/or DNA inhibits expression of the TCF7L2 polypeptide, e.g., by inhibiting translation and/or transcription. Binding of the antisense nucleic acid can be by conventional base pair complementarity, or, for example, in the case of binding to DNA duplexes, through specific interaction in the major groove of the double helix.

An antisense construct of the present invention can be delivered, for example, as an expression plasmid as described above. When the plasmid is transcribed in the cell, it produces RNA that is complementary to a portion of the mRNA and/or DNA which encodes the TCF7L2 polypeptide. Alternatively, the antisense construct can be an oligonucleotide probe that is generated ex vivo and introduced into cells; it then inhibits expression by hybridizing with the mRNA and/or genomic DNA of the polypeptide. In one embodiment, the oligonucleotide probes are modified oligonucleotides, which are resistant to endogenous nucleases, e.g., exonucleases and/or endonucleases, thereby rendering them stable in vivo. Exemplary nucleic acid molecules for use as antisense oligonucleotides are phosphoramidate, phosphorothioate and methylphosphonate analogs of DNA (see also U.S. Pat. Nos. 5,176,996; 5,264,564; and 5,256,775). Additionally, general approaches to constructing oligomers useful in antisense therapy are also described, for example, by Van der Krol et al., (BioTechniques 6:958-976 (1988)); and Stein et al., (Cancer Res. 48:2659-2668 (1988)). With respect to antisense DNA, oligodeoxyribonucleotides derived from the translation initiation site are preferred.

To perform antisense therapy, oligonucleotides (mRNA, cDNA or DNA) are designed that are complementary to mRNA encoding the TCF7L2 gene. The antisense oligonucleotides bind to TCF7L2 mRNA transcripts and prevent translation. Absolute complementarity, although preferred, is not required. A sequence “complementary” to a portion of an RNA, as referred to herein, indicates that a sequence has sufficient complementarity to be able to hybridize with the RNA, forming a stable duplex; in the case of double-stranded antisense nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex formation may be assayed. The ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid, as described in detail above. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an RNA it may contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures.
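The complementarity requirement described above can be illustrated with a minimal sketch that derives a perfectly complementary antisense DNA oligonucleotide, written 5′ to 3′, from a segment of target mRNA. The input sequence shown is hypothetical and is not a TCF7L2 sequence; the function name is an assumption, and oligonucleotide length, mismatch tolerance and chemical modification are not modeled.

```python
# Watson-Crick complements yielding a DNA oligonucleotide; U (RNA) and T (DNA)
# in the target are both paired with A.
_COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_dna(mrna_segment: str) -> str:
    """Return the antisense DNA oligonucleotide (5'->3') that is perfectly
    complementary to the given mRNA segment (also written 5'->3')."""
    sequence = mrna_segment.upper()
    try:
        # Reverse the target, then complement each base.
        return "".join(_COMPLEMENT[base] for base in reversed(sequence))
    except KeyError as exc:
        raise ValueError(f"unexpected base in sequence: {exc.args[0]!r}") from exc


if __name__ == "__main__":
    # Hypothetical segment beginning at a translation initiation codon.
    print(antisense_dna("AUGGCC"))
```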

The oligonucleotides used in antisense therapy can be DNA, RNA, or chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotides can be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, hybridization, etc. The oligonucleotides can include other appended groups such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al., Proc. Natl. Acad. Sci. USA 86:6553-6556 (1989); Lemaitre et al., Proc. Natl. Acad. Sci. USA 84:648-652 (1987); PCT International Publication No. WO 88/09810) or the blood-brain barrier (see, e.g., PCT International Publication No. WO 89/10134), or hybridization-triggered cleavage agents (see, e.g., Krol et al., BioTechniques 6:958-976 (1988)) or intercalating agents (see, e.g., Zon, Pharm. Res. 5:539-549 (1988)). To this end, the oligonucleotide may be conjugated to another molecule (e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent).

The antisense molecules are delivered to cells that express TCF7L2 in vivo. A number of methods can be used for delivering antisense DNA or RNA to cells; e.g., antisense molecules can be injected directly into the tissue site, or modified antisense molecules, designed to target the desired cells (e.g., antisense linked to peptides or antibodies that specifically bind receptors or antigens expressed on the target cell surface) can be administered systemically. Alternatively, in a preferred embodiment, a recombinant DNA construct is utilized in which the antisense oligonucleotide is placed under the control of a strong promoter (e.g., pol III or pol II). The use of such a construct to transfect target cells in the patient results in the transcription of sufficient amounts of single stranded RNAs that will form complementary base pairs with the endogenous TCF7L2 transcripts and thereby prevent translation of the TCF7L2 mRNA. For example, a vector can be introduced in vivo such that it is taken up by a cell and directs the transcription of an antisense RNA. Such a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the desired antisense RNA. Such vectors can be constructed by recombinant DNA technology methods standard in the art and described above. For example, a plasmid, cosmid, YAC or viral vector can be used to prepare the recombinant DNA construct that can be introduced directly into the tissue site. Alternatively, viral vectors can be used which selectively infect the desired tissue, in which case administration may be accomplished by another route (e.g., systemically).

Endogenous TCF7L2 polypeptide expression can also be reduced by inactivating or “knocking out” the gene, nucleic acid or its promoter using targeted homologous recombination (e.g., see Smithies et al., Nature 317:230-234 (1985); Thomas & Capecchi, Cell 51:503-512 (1987); Thompson et al., Cell 56:313-321 (1989)). For example, an altered, non-functional gene or nucleic acid (or a completely unrelated DNA sequence) flanked by DNA homologous to the endogenous gene or nucleic acid (either the coding regions or regulatory regions of the nucleic acid) can be used, with or without a selectable marker and/or a negative selectable marker, to transfect cells that express the gene or nucleic acid in vivo. Insertion of the DNA construct, via targeted homologous recombination, results in inactivation of the gene or nucleic acid. The recombinant DNA constructs can be directly administered or targeted to the required site in vivo using appropriate vectors, as described above. Alternatively, expression of non-altered genes or nucleic acids can be increased using a similar method: targeted homologous recombination can be used to insert a DNA construct comprising a non-altered functional gene or nucleic acid in place of an altered TCF7L2 in the cell, as described above. In another embodiment, targeted homologous recombination can be used to insert a DNA construct comprising a nucleic acid that encodes a Type II diabetes polypeptide variant that differs from that present in the cell.

Alternatively, endogenous TCF7L2 nucleic acid expression can be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory region of a TCF7L2 nucleic acid (i.e., the TCF7L2 promoter and/or enhancers) to form triple helical structures that prevent transcription of the TCF7L2 nucleic acid in target cells in the body. (See generally, Helene, C., Anticancer Drug Des., 6(6):569-84 (1991); Helene, C. et al., Ann. N.Y. Acad. Sci. 660:27-36 (1992); and Maher, L. J., BioEssays 14(12):807-15 (1992)). Likewise, the antisense constructs described herein, by antagonizing the normal biological activity of one of the TCF7L2 proteins, can be used in the manipulation of tissue, e.g., tissue differentiation, both in vivo and for ex vivo tissue cultures. Furthermore, the antisense techniques (e.g., microinjection of antisense molecules, or transfection with plasmids whose transcripts are antisense with regard to a Type II diabetes gene mRNA or gene sequence) can be used to investigate the role of TCF7L2 or the interaction of TCF7L2 and its binding agents in developmental events, as well as the normal cellular function of TCF7L2 or of the interaction of TCF7L2 and its binding agents in adult tissue. Such techniques can be utilized in cell culture, but can also be used in the creation of transgenic animals.

In yet another embodiment of the invention, other TCF7L2 therapeutic agents as described herein can also be used in the treatment of Type II diabetes. The therapeutic agents can be delivered in a composition, as described above, or by themselves. They can be administered systemically, or can be targeted to a particular tissue. The therapeutic agents can be produced by a variety of means, including chemical synthesis; recombinant production; in vivo production (e.g., a transgenic animal, such as U.S. Pat. No. 4,873,316 to Meade et al.), for example, and can be isolated using standard means such as those described herein.

A combination of any of the above methods of treatment (e.g., administration of non-altered polypeptide in conjunction with antisense therapy targeting altered mRNA of TCF7L2; administration of a first splicing variant encoded by a TCF7L2 nucleic acid in conjunction with antisense therapy targeting a second splicing variant encoded by a TCF7L2 nucleic acid) can also be used.

Methods of Assessing Probability of Response to TCF7L2 Therapeutic Agents

The present invention additionally pertains to methods of assessing an individual's probability of response to a TCF7L2 therapeutic agent. In the methods, markers or haplotypes relating to the TCF7L2 gene are assessed, as described above in relation to assessing an individual for susceptibility to type II diabetes. The presence of an allele, marker, SNP or haplotype associated with susceptibility (increased risk) for type II diabetes (e.g., an allele other than a 0 allele in marker DG10S478; a T allele in SNP rs12255372; an A allele in SNP rs7895340; a C allele in SNP rs11196205; a C allele in SNP rs7901695; a T allele in SNP rs7903146; a C allele in SNP rs12243326; a T allele in SNP rs4506565; a marker associated with the exon 4 LD block of TCF7L2, such as an at-risk haplotype associated with the exon 4 LD block of TCF7L2) is indicative of a probability of a positive response to a TCF7L2 therapeutic agent.

“Probability of a positive response” indicates that the individual is more likely to have a positive response to a TCF7L2 therapeutic agent than an individual not having an allele, marker, SNP or haplotype associated with susceptibility (increased risk) for type II diabetes as described herein. A “positive response” to a TCF7L2 therapeutic agent is a physiological response that indicates treatment of type II diabetes. As described above, “treatment” refers not only to ameliorating symptoms associated with type II diabetes, but also preventing or delaying the onset of type II diabetes; lessening the severity or frequency of symptoms of type II diabetes; and/or also lessening the need for concomitant therapy with other drugs that ameliorate symptoms associated with type II diabetes.

Pharmaceutical Compositions

The present invention also pertains to pharmaceutical compositions comprising agents that alter TCF7L2 activity or which otherwise affect the Wnt signaling pathway or the cadherin pathway, or which can be used as TCF7L2 therapeutic agents. The agents can be formulated with a physiologically acceptable carrier or excipient to prepare a pharmaceutical composition. The carrier and composition can be sterile. The formulation should suit the mode of administration.

Suitable pharmaceutically acceptable carriers include but are not limited to water, salt solutions (e.g., NaCl), saline, buffered saline, alcohols, glycerol, ethanol, gum arabic, vegetable oils, benzyl alcohols, polyethylene glycols, gelatin, carbohydrates such as lactose, amylose or starch, dextrose, magnesium stearate, talc, silicic acid, viscous paraffin, perfume oil, fatty acid esters, hydroxymethylcellulose, polyvinylpyrrolidone, etc., as well as combinations thereof. The pharmaceutical preparations can, if desired, be mixed with auxiliary agents, e.g., lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, coloring, flavoring and/or aromatic substances and the like which do not deleteriously react with the active agents.

The composition, if desired, can also contain minor amounts of wetting or emulsifying agents, or pH buffering agents. The composition can be a liquid solution, suspension, emulsion, tablet, pill, capsule, sustained release formulation, or powder. The composition can be formulated as a suppository, with traditional binders and carriers such as triglycerides. Oral formulation can include standard carriers such as pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, polyvinylpyrrolidone, sodium saccharin, cellulose, magnesium carbonate, etc.

Methods of introduction of these compositions include, but are not limited to, intradermal, intramuscular, intraperitoneal, intraocular, intravenous, subcutaneous, topical, oral and intranasal. Other suitable methods of introduction can also include gene therapy (as described below), rechargeable or biodegradable devices, particle acceleration devices (“gene guns”) and slow release polymeric devices. The pharmaceutical compositions of this invention can also be administered as part of a combinatorial therapy with other agents.

The composition can be formulated in accordance with routine procedures as a pharmaceutical composition adapted for administration to human beings. For example, compositions for intravenous administration typically are solutions in sterile isotonic aqueous buffer. Where necessary, the composition may also include a solubilizing agent and a local anesthetic to ease pain at the site of the injection. Generally, the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water free concentrate in a hermetically sealed container such as an ampule or sachet indicating the quantity of active agent. Where the composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water, saline or dextrose/water. Where the composition is administered by injection, an ampule of sterile water for injection or saline can be provided so that the ingredients may be mixed prior to administration.

For topical application, nonsprayable forms, viscous to semi-solid or solid forms comprising a carrier compatible with topical application and having a dynamic viscosity preferably greater than water, can be employed. Suitable formulations include but are not limited to solutions, suspensions, emulsions, creams, ointments, powders, enemas, lotions, sols, liniments, salves, aerosols, etc., which are, if desired, sterilized or mixed with auxiliary agents, e.g., preservatives, stabilizers, wetting agents, buffers or salts for influencing osmotic pressure, etc. The agent may be incorporated into a cosmetic formulation. For topical application, also suitable are sprayable aerosol preparations wherein the active ingredient, preferably in combination with a solid or liquid inert carrier material, is packaged in a squeeze bottle or in admixture with a pressurized volatile, normally gaseous propellant, e.g., pressurized air.

Agents described herein can be formulated as neutral or salt forms. Pharmaceutically acceptable salts include those formed with free amino groups such as those derived from hydrochloric, phosphoric, acetic, oxalic, tartaric acids, etc., and those formed with free carboxyl groups such as those derived from sodium, potassium, ammonium, calcium, ferric hydroxides, isopropylamine, triethylamine, 2-ethylamino ethanol, histidine, procaine, etc.

The agents are administered in a therapeutically effective amount. The amount of agents which will be therapeutically effective depends in part on the nature of the disorder and/or extent of symptoms, and can be determined by standard clinical techniques. In addition, in vitro or in vivo assays may optionally be employed to help identify optimal dosage ranges. The precise dose to be employed in the formulation will also depend on the route of administration, and the seriousness of the symptoms, and should be decided according to the judgment of a practitioner and each patient's circumstances. Effective doses may be extrapolated from dose-response curves derived from in vitro or animal model test systems.

The invention also provides a pharmaceutical pack or kit comprising one or more containers filled with one or more of the ingredients of the pharmaceutical compositions of the invention. Optionally associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. The pack or kit can be labeled with information regarding mode of administration, sequence of drug administration (e.g., separately, sequentially or concurrently), or the like. The pack or kit may also include means for reminding the patient to take the therapy. The pack or kit can be a single unit dosage of the combination therapy or it can be a plurality of unit dosages. In particular, the agents can be separated, mixed together in any combination, or present in a single vial or tablet. Agents assembled in a blister pack or other dispensing means are preferred. For the purpose of this invention, unit dosage is intended to mean a dosage that is dependent on the individual pharmacodynamics of each agent and administered in FDA approved dosages in standard time courses.

Screening Assays and Agents Identified Thereby

The invention also provides methods for identifying agents (e.g., fusion proteins, polypeptides, peptidomimetics, prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, or ribozymes) which alter (e.g., increase or decrease) the activity of TCF7L2, or which otherwise interact with TCF7L2 or with another member of the Wnt signaling pathway or the cadherin pathway (e.g., beta-catenin). For example, in certain embodiments, such agents can be agents which bind to TCF7L2; which have a stimulatory or inhibitory effect on, for example, activity of TCF7L2; which change (e.g., enhance or inhibit) the ability of TCF7L2 to interact with other members of the Wnt signaling pathway or with members of the cadherin pathway; or which alter posttranslational processing of TCF7L2. In other embodiments, such agents can be agents which alter activity or function of the Wnt signaling pathway or the cadherin pathway.

In one embodiment, the invention provides assays for screening candidate or test agents that bind to or modulate the activity of TCF7L2 protein (or biologically active portion(s) thereof), as well as agents identifiable by the assays. Test agents can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S., Anticancer Drug Des. 12:145 (1997)).

In one embodiment, to identify agents which alter the activity of TCF7L2, a cell, cell lysate, or solution containing or expressing TCF7L2, or a fragment or derivative thereof, can be contacted with an agent to be tested; alternatively, the protein can be contacted directly with the agent to be tested. The level (amount) of TCF7L2 activity is assessed (e.g., the level (amount) of TCF7L2 activity is measured, either directly or indirectly), and is compared with the level of activity in a control (i.e., the level of activity of the TCF7L2 protein or active fragment or derivative thereof in the absence of the agent to be tested). If the level of the activity in the presence of the agent differs, by an amount that is statistically significant, from the level of the activity in the absence of the agent, then the agent is an agent that alters the activity of TCF7L2. An increase in the level of activity relative to a control indicates that the agent is an agent that enhances (is an agonist of) activity. Similarly, a decrease in the level of activity relative to a control indicates that the agent is an agent that inhibits (is an antagonist of) activity. In another embodiment, the level of activity of TCF7L2 or a derivative or fragment thereof in the presence of the agent to be tested is compared with a control level that has previously been established. A level of the activity in the presence of the agent that differs from the control level by an amount that is statistically significant indicates that the agent alters TCF7L2 activity.
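By way of non-limiting illustration, the comparison described above can be sketched in Python. The text does not specify a statistical test; the two-sided permutation test used here, the significance threshold of 0.05, and the function names `permutation_p_value` and `classify_agent` are assumptions made only for this sketch, and the activity values in the usage note are invented replicate readings rather than data from the invention.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in mean activity.

    treated: activity readings in the presence of the test agent
    control: activity readings in the absence of the test agent
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_perm):
        # Reassign readings to the two groups at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


def classify_agent(treated, control, alpha=0.05):
    """Label an agent by the direction of a statistically significant change."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant change"
    return "agonist" if mean(treated) > mean(control) else "antagonist"
```

For example, `classify_agent([10.1, 9.8, 10.4, 10.0, 10.2, 9.9], [5.0, 5.2, 4.9, 5.1, 4.8, 5.3])` labels the hypothetical agent an agonist, because the treated readings are consistently higher than the control readings. Any other suitable significance test could be substituted.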

The present invention also relates to an assay for identifying agents which alter the expression of the TCF7L2 gene (e.g., antisense nucleic acids, fusion proteins, polypeptides, peptidomimetics, prodrugs, receptors, binding agents, antibodies, small molecules or other drugs, or ribozymes) which alter (e.g., increase or decrease) expression (e.g., transcription or translation) of the gene or which otherwise interact with TCF7L2, as well as agents identifiable by the assays. For example, a solution containing a nucleic acid encoding a TCF7L2 can be contacted with an agent to be tested. The solution can comprise, for example, cells containing the nucleic acid or cell lysate containing the nucleic acid; alternatively, the solution can be another solution that comprises elements necessary for transcription/translation of the nucleic acid. Cells not suspended in solution can also be employed, if desired. The level and/or pattern of TCF7L2 expression (e.g., the level and/or pattern of mRNA or of protein expressed, such as the level and/or pattern of different splicing variants) is assessed, and is compared with the level and/or pattern of expression in a control (i.e., the level and/or pattern of the TCF7L2 expression in the absence of the agent to be tested). If the level and/or pattern in the presence of the agent differs, by an amount or in a manner that is statistically significant, from the level and/or pattern in the absence of the agent, then the agent is an agent that alters the expression of a Type II diabetes gene. Enhancement of TCF7L2 expression indicates that the agent is an agonist of TCF7L2 activity. Similarly, inhibition of TCF7L2 expression indicates that the agent is an antagonist of TCF7L2 activity. In another embodiment, the level and/or pattern of TCF7L2 polypeptide(s) (e.g., different splicing variants) in the presence of the agent to be tested is compared with a control level and/or pattern that has previously been established. A level and/or pattern in the presence of the agent that differs from the control level and/or pattern by an amount or in a manner that is statistically significant indicates that the agent alters TCF7L2 expression.

In another embodiment of the invention, agents which alter the expression of TCF7L2 or which otherwise interact with TCF7L2 or with another member of the Wnt signaling pathway or the cadherin pathway, can be identified using a cell, cell lysate, or solution containing a nucleic acid encoding the promoter region of the TCF7L2 gene or nucleic acid operably linked to a reporter gene. After contact with an agent to be tested, the level of expression of the reporter gene (e.g., the level of mRNA or of protein expressed) is assessed, and is compared with the level of expression in a control (i.e., the level of the expression of the reporter gene in the absence of the agent to be tested). If the level in the presence of the agent differs, by an amount or in a manner that is statistically significant, from the level in the absence of the agent, then the agent is an agent that alters the expression of TCF7L2, as indicated by its ability to alter expression of a gene that is operably linked to the TCF7L2 gene promoter. Enhancement of the expression of the reporter indicates that the agent is an agonist of TCF7L2 activity. Similarly, inhibition of the expression of the reporter indicates that the agent is an antagonist of TCF7L2 activity. In another embodiment, the level of expression of the reporter in the presence of the agent to be tested is compared with a control level that has previously been established. A level in the presence of the agent that differs from the control level by an amount or in a manner that is statistically significant indicates that the agent alters expression.

Agents which alter the amounts of different splicing variants encoded by TCF7L2 (e.g., an agent which enhances activity of a first splicing variant, and which inhibits activity of a second splicing variant), as well as agents which are agonists of activity of a first splicing variant and antagonists of activity of a second splicing variant, can easily be identified using the methods described above.

In other embodiments of the invention, assays can be used to assess the impact of a test agent on the activity of a polypeptide in relation to a TCF7L2 binding agent. For example, a cell that expresses a compound that interacts with a TCF7L2 polypeptide (herein referred to as a “TCF7L2 binding agent”, which can be a polypeptide or other molecule that interacts directly or indirectly with a TCF7L2 polypeptide, such as a member of the Wnt signaling pathway or a member of the cadherin pathway) is contacted with TCF7L2 in the presence of a test agent, and the ability of the test agent to alter the interaction between the TCF7L2 and the TCF7L2 binding agent is determined. Alternatively, a cell lysate or a solution containing the TCF7L2 binding agent can be used. An agent that binds to the TCF7L2 or the TCF7L2 binding agent can alter the interaction by interfering with, or enhancing the ability of the TCF7L2 to bind to, associate with, or otherwise interact with the TCF7L2 binding agent. Determining the ability of the test agent to bind to TCF7L2 or a TCF7L2 binding agent can be accomplished, for example, by coupling the test agent with a radioisotope or enzymatic label such that binding of the test agent to the polypeptide can be determined by detecting the labeled agent. For example, test agents can be labeled with ¹²⁵I, ³⁵S, ¹⁴C or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, test agents can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product. It is also within the scope of this invention to determine the ability of a test agent to interact with the polypeptide without the labeling of any of the interactants. For example, a microphysiometer can be used to detect the interaction of a test agent with TCF7L2 or a TCF7L2 binding agent without the labeling of either the test agent, TCF7L2, or the TCF7L2 binding agent. McConnell, H. M. et al., Science 257:1906-1912 (1992). As used herein, a “microphysiometer” (e.g., Cytosensor™) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between ligand and polypeptide.

Thus, these receptors can be used to screen for compounds that are agonists or antagonists, for use in treating or studying a susceptibility to type II diabetes. Drugs could be designed to regulate TCF7L2 activation that in turn can be used to regulate signaling pathways and transcription events of genes downstream.

In another embodiment of the invention, assays can be used to identify polypeptides that interact with TCF7L2. For example, a yeast two-hybrid system such as that described by Fields and Song (Fields, S. and Song, O., Nature 340:245-246 (1989)) can be used to identify polypeptides that interact with TCF7L2. In such a yeast two-hybrid system, vectors are constructed based on the flexibility of a transcription factor that has two functional domains (a DNA binding domain and a transcription activation domain). If the two domains are separated but fused to two different proteins that interact with one another, transcriptional activation can be achieved, and transcription of specific markers (e.g., nutritional markers such as His and Ade, or color markers such as lacZ) can be used to identify the presence of interaction and transcriptional activation. For example, in the methods of the invention, a first vector is used which includes a nucleic acid encoding a DNA binding domain and also TCF7L2, splicing variant, or fragment or derivative thereof, and a second vector is used which includes a nucleic acid encoding a transcription activation domain and also a nucleic acid encoding a polypeptide which potentially may interact with TCF7L2 or a splicing variant, or fragment or derivative thereof. Incubation of yeast containing the first vector and the second vector under appropriate conditions (e.g., mating conditions such as used in the Matchmaker™ system from Clontech (Palo Alto, Calif., USA)) allows identification of colonies that express the markers of interest. These colonies can be examined to identify the polypeptide(s) that interact with TCF7L2 or fragment or derivative thereof. Such polypeptides can be used as agents that alter the activity or expression of TCF7L2, as described in relation to methods of treatment.

In more than one embodiment of the above assay methods of the present invention, it may be desirable to immobilize either the TCF7L2 gene, the TCF7L2 protein, the TCF7L2 binding agent (e.g., another member of the Wnt signaling pathway or member of the cadherin pathway), or other components of the assay on a solid support, in order to facilitate separation of complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test agent to the protein, or interaction of the protein with a binding agent in the presence and absence of a test agent, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtitre plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein (e.g., a glutathione-S-transferase fusion protein) can be provided which adds a domain that allows TCF7L2, TCF7L2 protein, or a TCF7L2 binding agent to be bound to a matrix or other solid support.

In another embodiment, modulators of expression of nucleic acid molecules of the invention are identified in a method wherein a cell, cell lysate, or solution containing TCF7L2 is contacted with a test agent and the expression of appropriate mRNA or polypeptide (e.g., splicing variant(s)) in the cell, cell lysate, or solution is determined. The level of expression of appropriate mRNA or polypeptide(s) in the presence of the test agent is compared to the level of expression of mRNA or polypeptide(s) in the absence of the test agent. The test agent can then be identified as a modulator of expression based on this comparison. For example, when expression of mRNA or polypeptide is greater (statistically significantly greater) in the presence of the test agent than in its absence, the test agent is identified as a stimulator or enhancer of the mRNA or polypeptide expression. Alternatively, when expression of the mRNA or polypeptide is less (statistically significantly less) in the presence of the test agent than in its absence, the test agent is identified as an inhibitor of the mRNA or polypeptide expression. The level of mRNA or polypeptide expression in the cells can be determined by methods described herein for detecting mRNA or polypeptide.

This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in the methods of treatment described herein. For example, an agent identified as described herein can be used to alter activity of a protein encoded by a TCF7L2 gene, or to alter expression of TCF7L2 by contacting the protein or the nucleic acid (or contacting a cell comprising the polypeptide or the nucleic acid) with the agent identified as described herein.

Nucleic Acids of the Invention

TCF7L2 Nucleic Acids, Portions and Variants

The present invention also pertains to isolated nucleic acid molecules comprising human TCF7L2. The TCF7L2 nucleic acid molecules of the present invention can be RNA, for example, mRNA, or DNA, such as cDNA and genomic DNA. DNA molecules can be double-stranded or single-stranded; single stranded RNA or DNA can be the coding, or sense, strand or the non-coding, or antisense strand. The nucleic acid molecule can include all or a portion of the coding sequence of the gene and can further comprise additional non-coding sequences such as introns and non-coding 3′ and 5′ sequences (including regulatory sequences, for example).

Additionally, nucleic acid molecules of the invention can be fused to a marker sequence, for example, a sequence that encodes a polypeptide to assist in isolation or purification of the polypeptide. Such sequences include, but are not limited to, those that encode a glutathione-S-transferase (GST) fusion protein and those that encode a hemagglutinin A (HA) polypeptide marker from influenza.

An “isolated” nucleic acid molecule, as used herein, is one that is separated from nucleic acids that normally flank the gene or nucleotide sequence (as in genomic sequences) and/or has been completely or partially purified from other transcribed sequences (e.g., as in an RNA library). For example, an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid molecule comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present. With regard to genomic DNA, the term “isolated” also can refer to nucleic acid molecules that are separated from the chromosome with which the genomic DNA is naturally associated. For example, the isolated nucleic acid molecule can contain less than about 5 kb, including but not limited to 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb, of nucleotides which flank the nucleic acid molecule in the genomic DNA of the cell from which the nucleic acid molecule is derived.

The nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated. Thus, recombinant DNA contained in a vector is included in the definition of “isolated” as used herein. Also, isolated nucleic acid molecules include recombinant DNA molecules in heterologous host cells, as well as partially or substantially purified DNA molecules in solution. “Isolated” nucleic acid molecules also encompass in vivo and in vitro RNA transcripts of the DNA molecules of the present invention. An isolated nucleic acid molecule can include a nucleic acid molecule or nucleic acid sequence that is synthesized chemically or by recombinant means. Therefore, recombinant DNA contained in a vector is included in the definition of “isolated” as used herein. Also, isolated nucleic acid molecules include recombinant DNA molecules in heterologous organisms, as well as partially or substantially purified DNA molecules in solution. In vivo and in vitro RNA transcripts of the DNA molecules of the present invention are also encompassed by “isolated” nucleic acid sequences. Such isolated nucleic acid molecules are useful in the manufacture of the encoded polypeptide, as probes for isolating homologous sequences (e.g., from other mammalian species), for gene mapping (e.g., by in situ hybridization with chromosomes), or for detecting expression of the gene in tissue (e.g., human tissue), such as by Northern or Southern blot analysis.

The present invention also pertains to nucleic acid molecules which are not necessarily found in nature but which encode a TCF7L2 polypeptide, or another splicing variant of a TCF7L2 polypeptide or polymorphic variant thereof. Thus, for example, the invention pertains to DNA molecules comprising a sequence that is different from the naturally occurring nucleotide sequence but which, due to the degeneracy of the genetic code, encode a TCF7L2 polypeptide of the present invention.
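The degeneracy of the genetic code referred to above can be illustrated with a brief Python sketch. The codon table is the standard genetic code; the function name `translate` and the two 12-nucleotide sequences are arbitrary toy examples chosen for this illustration and are not TCF7L2 sequences.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding-strand DNA sequence, read in frame from position 0."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two toy sequences that differ at three codons yet encode the same peptide.
first_sequence = "ATGGCTCTGAAA"
second_sequence = "ATGGCCTTAAAG"
same_peptide = translate(first_sequence) == translate(second_sequence)
```

Here both toy sequences translate to the peptide Met-Ala-Leu-Lys, so `same_peptide` is true even though the nucleotide sequences differ.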

The invention also encompasses nucleic acid molecules encoding portions (fragments), or encoding variant polypeptides such as analogues or derivatives of a TCF7L2 polypeptide. Such variants can be naturally occurring, such as in the case of allelic variation or single nucleotide polymorphisms, or non-naturally-occurring, such as those induced by various mutagens and mutagenic processes. Intended variations include, but are not limited to, addition, deletion and substitution of one or more nucleotides that can result in conservative or non-conservative amino acid changes, including additions and deletions. Preferably the nucleotide (and/or resultant amino acid) changes are silent or conserved; that is, they do not alter the characteristics or activity of a TCF7L2 polypeptide. In one aspect, the nucleic acid sequences are fragments that comprise one or more polymorphic microsatellite markers. In another aspect, the nucleotide sequences are fragments that comprise one or more single nucleotide polymorphisms in a TCF7L2 gene.

Other alterations of the nucleic acid molecules of the invention can include, for example, labeling, methylation, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates), charged linkages (e.g., phosphorothioates, phosphorodithioates), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids). Also included are synthetic molecules that mimic nucleic acid molecules in the ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule.

The invention also pertains to nucleic acid molecules that hybridize under high stringency hybridization conditions, such as for selective hybridization, to a nucleotide sequence described herein (e.g., nucleic acid molecules which specifically hybridize to a nucleotide sequence encoding polypeptides described herein, and, optionally, have an activity of the polypeptide). In one aspect, the invention includes variants described herein that hybridize under high stringency hybridization conditions (e.g., for selective hybridization) to a nucleotide sequence encoding an amino acid sequence or a polymorphic variant thereof. In another aspect, the variant that hybridizes under high stringency hybridization conditions has an activity of a TCF7L2 polypeptide.

Such nucleic acid molecules can be detected and/or isolated by specific hybridization (e.g., under high stringency conditions). “Specific hybridization,” as used herein, refers to the ability of a first nucleic acid to hybridize to a second nucleic acid in a manner such that the first nucleic acid does not hybridize to any nucleic acid other than to the second nucleic acid (e.g., when the first nucleic acid has a higher similarity to the second nucleic acid than to any other nucleic acid in a sample wherein the hybridization is to be performed). “Stringency conditions” for hybridization is a term of art which refers to the incubation and wash conditions, e.g., conditions of temperature and buffer concentration, which permit hybridization of a particular nucleic acid to a second nucleic acid; the first nucleic acid may be perfectly (i.e., 100%) complementary to the second, or the first and second may share some degree of complementarity which is less than perfect (e.g., 70%, 75%, 85%, 90%, 95%). For example, certain high stringency conditions can be used which distinguish perfectly complementary nucleic acids from those of less complementarity. “High stringency conditions”, “moderate stringency conditions” and “low stringency conditions”, as well as methods for nucleic acid hybridizations, are explained on pages 2.10.1-2.10.16 and pages 6.3.1-6.3.6 in Current Protocols in Molecular Biology (Ausubel, F. et al., “Current Protocols in Molecular Biology”, John Wiley & Sons, (1998)), and in Kraus, M. and Aaronson, S., Methods Enzymol., 200:546-556 (1991).

The percent homology or identity of two nucleotide or amino acid sequences can be determined by aligning the sequences for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first sequence for optimal alignment). The nucleotides or amino acids at corresponding positions are then compared, and the percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = (# of identical positions / total # of positions) × 100). When a position in one sequence is occupied by the same nucleotide or amino acid residue as the corresponding position in the other sequence, then the molecules are homologous at that position. As used herein, nucleic acid or amino acid “homology” is equivalent to nucleic acid or amino acid “identity”. In certain aspects, the length of a sequence aligned for comparison purposes is at least 30%, for example, at least 40%, in certain aspects at least 60%, and in other aspects at least 70%, 80%, 90% or 95% of the length of the reference sequence. The actual comparison of the two sequences can be accomplished by well-known methods, for example, using a mathematical algorithm. A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al., Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993). Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997). When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. In one aspect, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).
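As a minimal sketch of the percent identity calculation stated above, the following Python function counts identical positions over two sequences that have already been aligned. It does not perform the alignment itself (the cited algorithms do that). The function name `percent_identity`, the use of "-" as the gap character, and the convention that a gapped column counts toward the total but never as identical are assumptions made for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    % identity = (# of identical positions / total # of positions) x 100
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    identical = sum(
        1
        for a, b in zip(aligned_a, aligned_b)
        if a == b and a != gap  # a gap never counts as an identical position
    )
    return 100.0 * identical / len(aligned_a)
```

For example, `percent_identity("ACGTACGT", "ACGTTCGT")` gives 87.5 (seven identical positions out of eight), and `percent_identity("ACG-T", "ACGAT")` gives 80.0 (four identical positions out of five aligned columns).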

Another preferred non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS 4(1): 11-17 (1988). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package (Accelrys, Cambridge, UK). When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torelli and Robotti, Comput. Appl. Biosci. 10:3-5 (1994); and FASTA described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85:2444-8 (1988).

In another aspect, the percent identity between two amino acid sequences can be determined using the GAP program in the GCG software package using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 12, 10, 8, 6, or 4 and a length weight of 2, 3, or 4. In yet another aspect, the percent identity between two nucleic acid sequences can be determined using the GAP program in the GCG software package using a gap weight of 50 and a length weight of 3.

The present invention also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleotide sequence of TCF7L2, or the complement of such a sequence, and also provides isolated nucleic acid molecules that contain a fragment or portion that hybridizes under highly stringent conditions to a nucleotide sequence encoding an amino acid sequence or polymorphic variant thereof. The nucleic acid fragments of the invention are at least about 15, preferably at least about 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100, 200 or more nucleotides in length. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic polypeptides described herein, are particularly useful, such as for the generation of antibodies as described below.

Probes and Primers

In a related aspect, the nucleic acid fragments of the invention are used as probes or primers in assays such as those described herein. “Probes” or “primers” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid molecules. Such probes and primers include peptide nucleic acids, as described in Nielsen et al., Science 254:1497-1500 (1991).

A probe or primer comprises a region of nucleotide sequence that hybridizes to at least about 15, for example about 20-25, and in certain aspects about 40, 50 or 75, consecutive nucleotides of a nucleic acid molecule comprising a contiguous nucleotide sequence of TCF7L2 or polymorphic variant thereof. In other aspects, a probe or primer comprises 100 or fewer nucleotides, in certain aspects from 6 to 50 nucleotides, for example from 12 to 30 nucleotides. In other aspects, the probe or primer is at least 70% identical to the contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence, for example at least 80% identical, in certain aspects at least 90% identical, and in other aspects at least 95% identical, or even capable of selectively hybridizing to the contiguous nucleotide sequence or to the complement of the contiguous nucleotide sequence. Often, the probe or primer further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.
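By way of non-limiting illustration, base-specific hybridization of a probe to a complementary strand can be modeled in its simplest form as an exact match between the reverse complement of the probe and a stretch of the target. The function names and the example sequences below are hypothetical, and the sketch ignores mismatches, stringency conditions and melting temperature, all of which matter in practice.

```python
from typing import List

# Watson-Crick pairing for DNA; other symbols are left unchanged.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_sites(probe: str, target: str) -> List[int]:
    """0-based start positions on `target` where `probe` would pair perfectly.

    A probe pairs with the target wherever the target contains the
    reverse complement of the probe.
    """
    site = reverse_complement(probe).upper()
    target = target.upper()
    width = len(site)
    return [i for i in range(len(target) - width + 1)
            if target[i:i + width] == site]


if __name__ == "__main__":
    target = "TTGACCTGAAGT"                 # hypothetical target strand
    probe = reverse_complement("GACCTG")    # probe designed against one site
    print(probe, probe_sites(probe, target))
```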

The nucleic acid molecules of the invention such as those described above can be identified and isolated using standard molecular biology techniques and the sequence information provided herein. For example, nucleic acid molecules can be amplified and isolated by the polymerase chain reaction using synthetic oligonucleotide primers designed based on the sequence of TCF7L2 or the complement of such a sequence, or designed based on nucleotide sequences encoding one or more of the amino acid sequences provided herein. See generally PCR Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and Applications (Eds. Innis et al., Academic Press, San Diego, Calif., 1990); Mattila et al., Nucl. Acids Res. 19: 4967 (1991); Eckert et al., PCR Methods and Applications 1:17 (1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202. The nucleic acid molecules can be amplified using cDNA, mRNA or genomic DNA as a template, cloned into an appropriate vector and characterized by DNA sequence analysis.

Other suitable amplification methods include the ligase chain reaction (LCR) (see Wu and Wallace, Genomics 4:560 (1989); Landegren et al., Science 241:1077 (1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86:1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA 87:1874 (1990)), and nucleic acid sequence-based amplification (NASBA). The latter two amplification methods involve isothermal reactions based on isothermal transcription, which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or 100 to 1, respectively.

The amplified DNA can be labeled, for example, radiolabeled, and used as a probe for screening a cDNA library derived from human cells, mRNA in zap express, ZIPLOX or other suitable vector. Corresponding clones can be isolated, DNA can be obtained following in vivo excision, and the cloned insert can be sequenced in either or both orientations by art recognized methods to identify the correct reading frame encoding a polypeptide of the appropriate molecular weight. For example, the direct analysis of the nucleotide sequence of nucleic acid molecules of the present invention can be accomplished using well-known methods that are commercially available. See, for example, Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual (Acad. Press, 1988). Additionally, fluorescence methods are also available for analyzing nucleic acids (Chen et al., Genome Res. 9, 492 (1999)) and polypeptides. Using these or similar methods, the polypeptide and the DNA encoding the polypeptide can be isolated, sequenced and further characterized.

Antisense nucleic acid molecules of the invention can be designed using the nucleotide sequence of TCF7L2 and/or the complement or a portion, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid molecule (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Alternatively, the antisense nucleic acid molecule can be produced biologically using an expression vector into which a nucleic acid molecule has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid molecule will be of an antisense orientation to a target nucleic acid of interest).

The nucleic acid sequences can also be used to compare with endogenous DNA sequences in patients to identify one or more of the disorders described above, and as probes, such as to hybridize and discover related DNA sequences or to subtract out known sequences from a sample. The nucleic acid sequences can further be used to derive primers for genetic fingerprinting, to raise anti-polypeptide antibodies using DNA immunization techniques, and as an antigen to raise anti-DNA antibodies or elicit immune responses. Portions or fragments of the nucleotide sequences identified herein (and the corresponding complete gene sequences) can be used in numerous ways, such as polynucleotide reagents. For example, these sequences can be used to: (i) map their respective genes on a chromosome; and, thus, locate gene regions associated with genetic disease; (ii) identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification of a biological sample. Additionally, the nucleotide sequences of the invention can be used to identify and express recombinant polypeptides for analysis, characterization or therapeutic use, or as markers for tissues in which the corresponding polypeptide is expressed, either constitutively, during tissue differentiation, or in diseased states. The nucleic acid sequences can additionally be used as reagents in the screening and/or diagnostic assays described herein, and can also be included as components of kits (e.g., reagent kits) for use in the screening and/or diagnostic assays described herein.

Kits (e.g., reagent kits) useful in the methods of diagnosis comprise components useful in any of the methods described herein, including for example, hybridization probes or primers as described herein (e.g., labeled probes or primers), reagents for detection of labeled molecules, restriction enzymes (e.g., for RFLP analysis), allele-specific oligonucleotides, antibodies which bind to altered or to non-altered (native) TCF7L2 polypeptide, means for amplification of nucleic acids comprising a TCF7L2 nucleic acid or for a portion of TCF7L2, or means for analyzing the nucleic acid sequence of a TCF7L2 nucleic acid or for analyzing the amino acid sequence of a TCF7L2 polypeptide as described herein, etc. In one aspect, the kit for diagnosing a susceptibility to type II diabetes can comprise primers for nucleic acid amplification of a region in the TCF7L2 nucleic acid comprising the marker DG10S478, the SNP rs12255372, rs895340, rs11196205, rs7901695, rs7903146, rs12243326 and/or rs4506565, or an at-risk haplotype that is more frequently present in an individual having type II diabetes or who is susceptible to type II diabetes. The primers can be designed using portions of the nucleic acids flanking SNPs that are indicative of type II diabetes.

Vectors and Host Cells

Another aspect of the invention pertains to nucleic acid constructs containing a nucleic acid molecule described herein and the complements thereof (or a portion thereof). The constructs comprise a vector (e.g., an expression vector) into which a sequence of the invention has been inserted in a sense or antisense orientation. As used herein, the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a “plasmid”, which refers to a circular double stranded DNA loop into which additional DNA segments can be ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Expression vectors are capable of directing the expression of genes to which they are operably linked. In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated viruses) that serve equivalent functions.

In certain aspects, recombinant expression vectors of the invention comprise a nucleic acid molecule of the invention in a form suitable for expression of the nucleic acid molecule in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operably linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” or “operatively linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term “regulatory sequence” is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, “Gene Expression Technology”, Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed and the level of expression of polypeptide desired. The expression vectors of the invention can be introduced into host cells to thereby produce polypeptides, including fusion polypeptides, encoded by nucleic acid molecules as described herein.

The recombinant expression vectors of the invention can be designed for expression of a polypeptide of the invention in prokaryotic or eukaryotic cells, e.g., bacterial cells such as E. coli, insect cells (using baculovirus expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel, supra. Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Another aspect of the invention pertains to host cells into which a recombinant expression vector of the invention has been introduced. The terms “host cell” and “recombinant host cell” are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, a nucleic acid molecule of the invention can be expressed in bacterial cells (e.g., E. coli), insect cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing a foreign nucleic acid molecule (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al., (supra), and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those that confer resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid molecules encoding a selectable marker can be introduced into a host cell on the same vector as the nucleic acid molecule of the invention or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid molecule can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) a polypeptide of the invention. Accordingly, the invention further provides methods for producing a polypeptide using the host cells of the invention. In one aspect, the method comprises culturing the host cell of the invention (into which a recombinant expression vector encoding a polypeptide of the invention has been introduced) in a suitable medium such that the polypeptide is produced. In another aspect, the method further comprises isolating the polypeptide from the medium or the host cell.

Antibodies of the Invention

Polyclonal antibodies and/or monoclonal antibodies that specifically bind one form of the gene product but not to the other form of the gene product are also provided. Antibodies are also provided which bind a portion of either the variant or the reference gene product that contains the polymorphic site or sites. The term “antibody” as used herein refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain antigen-binding sites that specifically bind an antigen. A molecule that specifically binds to a polypeptide of the invention is a molecule that binds to that polypeptide or a fragment thereof, but does not substantially bind other molecules in a sample, e.g., a biological sample, which naturally contains the polypeptide. Examples of immunologically active portions of immunoglobulin molecules include F(ab) and F(ab′)₂ fragments which can be generated by treating the antibody with an enzyme such as pepsin. The invention provides polyclonal and monoclonal antibodies that bind to a polypeptide of the invention. The term “monoclonal antibody” or “monoclonal antibody composition”, as used herein, refers to a population of antibody molecules that contain only one species of an antigen binding site capable of immunoreacting with a particular epitope of a polypeptide of the invention. A monoclonal antibody composition thus typically displays a single binding affinity for a particular polypeptide of the invention with which it immunoreacts.

Polyclonal antibodies can be prepared as described above by immunizing a suitable subject with a desired immunogen, e.g., polypeptide of the invention or a fragment thereof. The antibody titer in the immunized subject can be monitored over time by standard techniques, such as with an enzyme linked immunosorbent assay (ELISA) using immobilized polypeptide. If desired, the antibody molecules directed against the polypeptide can be isolated from the mammal (e.g., from the blood) and further purified by well-known techniques, such as protein A chromatography to obtain the IgG fraction. At an appropriate time after immunization, e.g., when the antibody titers are highest, antibody-producing cells can be obtained from the subject and used to prepare monoclonal antibodies by standard techniques, such as the hybridoma technique originally described by Kohler and Milstein, Nature 256:495-497 (1975), the human B cell hybridoma technique (Kozbor et al., Immunol. Today 4: 72 (1983)), the EBV-hybridoma technique (Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., 1985, pp. 77-96) or trioma techniques. The technology for producing hybridomas is well known (see generally Current Protocols in Immunology (1994) Coligan et al., (eds.) John Wiley & Sons, Inc., New York, N.Y.). Briefly, an immortal cell line (typically a myeloma) is fused to lymphocytes (typically splenocytes) from a mammal immunized with an immunogen as described above, and the culture supernatants of the resulting hybridoma cells are screened to identify a hybridoma producing a monoclonal antibody that binds a polypeptide of the invention.

Any of the many well known protocols used for fusing lymphocytes and immortalized cell lines can be applied for the purpose of generating a monoclonal antibody to a polypeptide of the invention (see, e.g., Current Protocols in Immunology, supra; Galfre et al., Nature 266:550-52 (1977); R. H. Kenneth, in Monoclonal Antibodies: A New Dimension In Biological Analyses, Plenum Publishing Corp., New York, N.Y. (1980); and Lerner, Yale J. Biol. Med. 54:387-402 (1981)). Moreover, the ordinarily skilled worker will appreciate that there are many variations of such methods that also would be useful.

As an alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal antibody to a polypeptide of the invention can be identified and isolated by screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage display library) with the polypeptide to thereby isolate immunoglobulin library members that bind the polypeptide. Kits for generating and screening phage display libraries are commercially available (e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612). Additionally, examples of methods and reagents particularly amenable for use in generating and screening antibody display libraries can be found in, for example, U.S. Pat. No. 5,223,409; PCT Publication No. WO 92/18619; PCT Publication No. WO 91/17271; PCT Publication No. WO 92/20791; PCT Publication No. WO 92/15679; PCT Publication No. WO 93/01288; PCT Publication No. WO 92/01047; PCT Publication No. WO 92/09690; PCT Publication No. WO 90/02809; Fuchs et al., Bio/Technology 9: 1370-1372 (1991); Hay et al., Hum. Antibod. Hybridomas 3:81-85 (1992); Huse et al., Science 246: 1275-1281 (1989); and Griffiths et al., EMBO J. 12:725-734 (1993).

Additionally, recombinant antibodies, such as chimeric and humanized monoclonal antibodies, comprising both human and non-human portions, which can be made using standard recombinant DNA techniques, are within the scope of the invention. Such chimeric and humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art.

In general, antibodies of the invention (e.g., a monoclonal antibody) can be used to isolate a polypeptide of the invention by standard techniques, such as affinity chromatography or immunoprecipitation. A polypeptide-specific antibody can facilitate the purification of natural polypeptide from cells and of recombinantly produced polypeptide expressed in host cells. Moreover, an antibody specific for a polypeptide of the invention can be used to detect the polypeptide (e.g., in a cellular lysate, cell supernatant, or tissue sample) in order to evaluate the abundance and pattern of expression of the polypeptide. Antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, for example, to determine the efficacy of a given treatment regimen. The antibody can be coupled to a detectable substance to facilitate its detection. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, beta-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

The present invention is now illustrated by the following Exemplification, which is not intended to be limiting in any way.

Exemplification

Described herein is the identification of transcription factor 7-like 2 (TCF7L2, formerly TCF4) as a gene conferring risk of type II diabetes through single-point association analysis using a dense set of microsatellite markers within the 10q locus.

Methods

Icelandic Cohort

The Data Protection Authority of Iceland and the National Bioethics Committee of Iceland approved the study. All participants in the study gave informed consent. All personal identifiers associated with blood samples, medical information, and genealogy were first encrypted by the Data Protection Authority, using a third-party encryption system (18).

For this study, 2400 type II diabetes patients were identified who were diagnosed either through a long-term epidemiologic study done at the Icelandic Heart Association over the past 30 years or at one of two major hospitals in Reykjavik over the past 12 years. Two-thirds of these patients were alive, representing about half of the population of known type II diabetes patients in Iceland today. The majority of these patients were contacted for this study, and the cooperation rate exceeded 80%. All participants in the study visited the Icelandic Heart Association where they answered a questionnaire, had blood drawn and had a fasting plasma glucose measurement taken. Questions about medication and age at diagnosis were included. The type II diabetes patients in this study were diagnosed as described in our previously published linkage study (10). In brief, the diagnosis of type II diabetes was confirmed by study physicians through previous medical records, medication history, and/or new laboratory measurements. For previously diagnosed type II diabetes patients, reporting of the use of oral glucose-lowering agents confirmed type II diabetes. Individuals who were currently treated with insulin were classified as having type II diabetes if they were also using or had previously used oral glucose-lowering agents. In this cohort the majority of patients on medication take oral glucose-lowering agents and only a small portion (9%) require insulin. For hitherto undiagnosed individuals, the diagnosis of type II diabetes and impaired fasting glucose (IFG) was based on the criteria set by the American Diabetes Association (Expert Committee on the Diagnosis and Classification of Diabetes Mellitus 1997). The average age of the type II diabetes patients in this study was 69.7 years.

Replication Cohorts

The Danish study group was selected from the PERF (Prospective Epidemiological Risk Factors) study in Denmark (19). 228 females had been diagnosed previously with type II diabetes and/or had a measured glucose level of ≥7 mM. As controls, 539 unaffected (with respect to type II diabetes) females were randomly drawn from the same study cohort.

The PENN CATH study in the US is a cross sectional study of the association of biochemical and genetic factors with coronary atherosclerosis in a consecutive cohort of patients undergoing cardiac catheterization at the University of Pennsylvania Medical Center between July 1998 and March 2003. Type II diabetes was defined as history of fasting blood glucose ≥126 mg/dl, 2-hour post-prandial glucose ≥200 mg/dl, use of oral hypoglycemic agents, or insulin and oral hypoglycemic agents in a subject older than 40. The University of Pennsylvania Institutional Review Board approved the study protocol and all subjects gave written informed consent. Ethnicity was determined through self-report. 361 Caucasian type II diabetes cases were derived from this cohort. 530 unaffected (with respect to type II diabetes and myocardial infarction) Caucasian controls were randomly drawn from the same study.

The DNA used for genotyping was the product of whole-genome amplification, by use of the GenomiPhi Amplification kit (Amersham), of DNA isolated from the peripheral blood of the Danish and US type II diabetes patients and controls.

Genotyping

New sequence repeats (i.e., dinucleotide, trinucleotide, and tetranucleotide repeats) were identified using the Tandem repeats finder software (20) and tested for polymorphicity in 94 controls. The size in basepairs of the lower allele of the CEPH sample 1347-02 (CEPH genomics repository) was subtracted from the size of the microsatellite amplicon and used as a reference. SNP genotyping was carried out using direct DNA sequencing (Applied BioSystems) or the Centaurus platform (Nanogen).
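By way of non-limiting illustration, the reference-based allele naming described above amounts to a subtraction: each microsatellite allele is labeled by the difference in basepairs between its amplicon and the lower allele of the CEPH reference sample. The function name and the amplicon sizes used below are hypothetical values chosen only to show how labels such as 0, 4, 8 and 12 arise for a tetranucleotide repeat.

```python
def allele_label(amplicon_size_bp: int, reference_size_bp: int) -> int:
    """Label an allele by its size offset (bp) from the reference allele.

    `reference_size_bp` stands for the size of the lower allele of the
    reference sample; an allele of the same size is therefore allele 0.
    """
    return amplicon_size_bp - reference_size_bp


if __name__ == "__main__":
    reference = 384  # hypothetical reference amplicon size in bp
    for size in (384, 388, 392, 396):  # hypothetical observed sizes
        print(size, "->", allele_label(size, reference))
```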

Statistical Methods for Association Analysis

For single marker association to type II diabetes, we used a likelihood ratio test to calculate a two-sided p-value for each allele. We present allelic frequencies rather than carrier frequencies for the microsatellites employed.
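By way of non-limiting illustration, a likelihood ratio test of one allele against all other alleles can be sketched on a 2×2 table of chromosome counts. This is a textbook form of the test (the G statistic referred to a chi-square distribution with one degree of freedom) and is not asserted to be the exact procedure used in the study; in particular it treats all chromosomes as independent and therefore makes no correction for relatedness among patients. The function name and the counts in the usage example are hypothetical.

```python
import math
from typing import Tuple


def allele_lrt(case_allele: int, case_other: int,
               ctrl_allele: int, ctrl_other: int) -> Tuple[float, float]:
    """Likelihood ratio (G) test for a 2x2 allele-count table.

    Returns (G, two-sided p-value). Under the null hypothesis of equal
    allele frequency in cases and controls, G is approximately
    chi-square distributed with 1 degree of freedom.
    """
    observed = [[case_allele, case_other], [ctrl_allele, ctrl_other]]
    total = sum(sum(row) for row in observed)
    if total == 0:
        raise ValueError("table is empty")

    row_sums = [sum(row) for row in observed]
    col_sums = [observed[0][j] + observed[1][j] for j in range(2)]

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to G
                expected = row_sums[i] * col_sums[j] / total
                g += 2.0 * o * math.log(o / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error

    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


if __name__ == "__main__":
    # Hypothetical chromosome counts, for illustration only.
    g, p = allele_lrt(80, 20, 50, 50)
    print(f"G = {g:.2f}, p = {p:.2e}")
```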

We calculated relative risk (RR) and population attributable risk (PAR) assuming a multiplicative model (16, 17). For the CEPH Caucasian HapMap data, we calculated LD between pairs of SNPs using the standard definition of D′ (21) and r² (22). When plotting all SNP combinations to elucidate the LD structure in a particular region, we plotted D′ in the upper left corner and p-values in the lower right corner. In the LD plot we present, the markers are plotted equidistantly rather than according to their physical positions.
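By way of non-limiting illustration, the two pairwise LD measures can be computed from haplotype and allele frequencies as sketched below, using the usual textbook definitions: D = p(AB) − p(A)p(B), D′ = |D| / Dmax, and r² = D² / [p(A)(1 − p(A))p(B)(1 − p(B))]. The function name and the example frequencies are hypothetical, and the sketch assumes that phased haplotype frequencies are already available; estimating them from genotype data is a separate step not shown here.

```python
from typing import Tuple


def ld_measures(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D', r^2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    for name, p in (("p_a", p_a), ("p_b", p_b)):
        if not 0.0 < p < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    d = p_ab - p_a * p_b

    # D is bounded by the allele frequencies; the bound depends on its sign.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies: allele A always travels with allele B.
    print(ld_measures(p_ab=0.3, p_a=0.3, p_b=0.3))
```

D′ reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two markers carry the same information, which is why both are commonly reported together.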

Results

Locus-Wide Association Study

We previously reported genome-wide significant linkage to chromosome 5q for type II diabetes mellitus in the Icelandic population (10); in the same study, we also reported suggestive evidence of linkage to 10q and 12q. To follow up the 10q locus, we used an association approach employing a high density of genotyped microsatellite markers across a 10.5 Mb region (NCBI Build 34: Chr10:114.2-124.7 Mb) corresponding to this locus. We identified and typed 228 microsatellite markers, i.e., at an average density of one marker every 46 kb (Table 1). All the markers were typed in 1185 Icelandic type II diabetes patients and 931 unrelated population controls.

TABLE 1 Location of the 228 genotyped microsatellites on chromosome 10 in NCBI Build 34 of the human genome assembly.

Alias       START: Build 34 Chr10 location   END: Build 34 Chr10 location
D10S1269    114186051                        114186276
DG10S475    114389853                        114390116
D10S168     114410102                        114410266
DG10S478    114460845                        114461228
DG10S479    114475488                        114475632
DG10S480    114507574                        114507829
DG10S481    114542657                        114542924
DG10S1624   114545990                        114546237
DG10S1625   114568323                        114568715
DG10S488    114713594                        114714008
DG10S1630   114770344                        114770609
DG10S1631   114778307                        114778598
DG10S492    114811884                        114812269
DG10S494    114852114                        114852280
DG10S495    114879344                        114879474
DG10S496    114919414                        114919678
DG10S498    114964123                        114964270
DG10S500    115024471                        115024854
DG10S501    115045332                        115045710
DG10S508    115241356                        115241602
DG10S1634   115267106                        115267460
DG10S512    115357290                        115357439
DG10S514    115400157                        115400338
DG10S17     115463773                        115464048
DG10S1635   115519619                        115519900
DG10S520    115536945                        115537130
D10S554     115695920                        115696071
D10S1237    115784580                        115784977
DG10S535    115858565                        115858720
D10S1158    115937134                        115937433
DG10S1636   115966165                        115966382
DG10S540    115983225                        115983471
DG10S1637   116025219                        116025491
DG10S542    116054130                        116054255
DG10S1638   116062921                        116063264
D10S1776    116140681                        116140897
DG10S546    116141340                        116141590
DG10S547    116173634                        116173887
DG10S1639   116184720                        116184898
DG10S548    116202775                        116203174
DG10S550    116288175                        116288560
D10S562     116304948                        116305132
DG10S1640   116344030                        116344279
DG10S1641   116638155                        116638540
DG10S566    116866173                        116866431
D10S468     116869582                        116869674
DG10S567    116904174                        116904433
D10S1731    117001692                        117001870
DG10S573    117070087                        117070192
DG10S576    117153566                        117153823
DG10S578    117196538                        117196813
DG10S1644   117206992                        117207391
DG10S579    117226056                        117226234
DG10S580    117240674                        117240858
DG10S584    117336471                        117336821
DG10S585    117364742                        117364845
DG10S586    117385650                        117385816
DG10S589    117481892                        117482165
DG10S590    117508690                        117508966
DG10S591    117520912                        117521057
DG10S593    117567541                        117567800
D10S1748    117589638                        117589885
DG10S596    117629981                        117630119
DG10S597    117654759                        117654928
DG10S523    117691905                        117692329
DG10S598    117691905                        117692156
D10S1773    117708786                        117708989
DG10S599    117713714                        117714115
DG10S524    117713997                        117714115
DG10S600    117742602                        117743019
DG10S525    117742701                        117742986
DG10S1250   117861226                        117861405
DG10S604    117867801                        117868010
DG10S1293   117932494                        117932721
DG10S1144   117950298                        117950606
DG10S609    118014503                        118014752
DG10S610    118041410                        118041787
DG10S1252   118085912                        118086081
DG10S612    118092869                        118093247
DG10S613    118126058                        118126312
DG10S614    118150018                        118150178
D10S544     118164684                        118164979
D10S1683    118211053                        118211180
D10S1657    118287426                        118287695
D10S545     118299618                        118299851
DG10S1649   118306954                        118307121
D10S187     118317655                        118317730
DG10S1295   118375973                        118376205
DG10S624    118401694                        118402073
DG10S1203   118440472                        118440835
DG10S627    118514695                        118515072
DG10S1650   118521021                        118521210
DG10S1681   118522946                        118523333
DG10S628    118553693                        118553836
DG10S634    118566844                        118567191
DG10S639    118712208                        118712596
DG10S640    118743450                        118743821
D10S221     118766458                        118766560
DG10S1686   118766464                        118766561
DG10S641    118788135                        118788401
DG10S1651   118794961                        118795267
DG10S1255   118834290                        118834438
DG10S644    118857362                        118857745
DG10S1652   118862172                        118862311
DG10S1654   118954536                        118954869
DG10S1688   118972583                        118972717
DG10S1689   118987319                        118987480
DG10S1690   119004704                        119004986
D10S1425    119004742                        119004920
DG10S651    119030166                        119030595
DG10S1655   119044005                        119044188
DG10S1691   119078576                        119078943
DG10S1207   119094382                        119094722
D10S1693    119109493                        119109731
DG10S1258   119131611                        119131788
DG10S656    119177278                        119177672
DG10S1694   119177430                        119177614
DG10S1695   119204432                        119204655
DG10S657    119204769                        119205174
DG10S658    119223917                        119224102
DG10S1696   119243071                        119243408
DG10S1657   119282299                        119282586
DG10S1658   119290241                        119290632
DG10S661    119305067                        119305226
DG10S662    119317406                        119317660
DG10S663    119330718                        119331131
DG10S1699   119364904                        119365188
DG10S665    119396863                        119397144
DG10S1659   119412611                        119412992
DG10S667    119448478                        119448736
DG10S1701   119473676                        119473914
D10S1236    119473739                        119473870
DG10S669    119485378                        119485552
DG10S670    119505799                        119505905
D10S190     119510348                        119510554
DG10S1702   119510362                        119510479
DG10S1153   119526060                        119526329
DG10S673    119606691                        119606963
DG10S1305   119615268                        119615484
DG10S675    119659153                        119659532
DG10S1661   119663175                        119663453
DG10S1662   119700563                        119700948
DG10S1306   119703996                        119704204
DG10S1663   119783538                        119783739
DG10S1704   119783569                        119783694
DG10S631    119788517                        119788678
D10S1148    119803465                        119803663
D10S1150    119803465                        119803662
D10S503     119803476                        119803653
DG10S632    119811193                        119811621
DG10S681    119811347                        119811621
DG10S633    119833701                        119833987
D10S2473    119833724                        119833869
DG10S682    119838539                        119838806
DG10S683    119853558                        119853862
DG10S684    119880412                        119880572
DG10S685    119909682                        119910062
DG10S686    119923527                        119923790
DG10S687    119954835                        119955083
DG10S1212   119972358                        119972707
DG10S1261   119995566                        119995727
DG10S1350   120004924                        120005036
DG10S1      120030830                        120031131
DG10S693    120100794                        120101005
DG10S1263   120132349                        120132528
D10S542     120417003                        120417230
DG10S1664   120444685                        120444808
DG10S1163   120506796                        120507066
DG10S703    120538236                        120538484
DG10S704    120570334                        120570593
DG10S706    120642052                        120642312
DG10S708    120699520                        120699811
DG10S709    120723780                        120724158
D10S1701    120849161                        120849428
DG10S716    120893782                        120894153
DG10S1669   120969521                        120969659
DG10S720    121016792                        121017048
D10S1792    121042408                        121042574
DG10S722    121070320                        121070693
DG10S1181   121101362                        121101685
DG10S724    121117025                        121117286
DG10S1670   121162511                        121162898
DG10S726    121217327                        121217580
DG10S1167   121247552                        121247838
DG10S729    121283257                        121283429
DG10S730    121318865                        121319131
DG10S731    121342622                        121342893
DG10S1278   121384227                        121384464
DG10S734    121425229                        121425633
DG10S735    121446549                        121446695
DG10S1185   121466936                        121467248
DG10S1129   121472295                        121472600
DG10S1085   121494260                        121494657
DG10S1327   121526700                        121526830
DG10S1271   121559895                        121560066
DG10S741    121638254                        121638391
DG10S1087   121647884                        121648273
DG10S1359   121713760                        121713892
DG10S1120   121726128                        121726519
DG10S1671   121750886                        121750993
DG10S1673   121823695                        121823925
DG10S749    121841816                        121841997
DG10S1134   121901381                        121901668
DG10S1674   121931406                        121931809
DG10S755    121976143                        121976435
D10S1757    121989325                        121989539
D10S209     121995173                        121995376
DG10S757    122029990                        122030248
DG10S1283   122045222                        122045429
DG10S1191   122071761                        122072115
DG10S761    122141102                        122141322
DG10S1678   122146312                        122146535
DG10S762    122167889                        122168135
DG10S763    122185793                        122185925
DG10S1284   122207287                        122207508
DG10S1137   122220809                        122221073
DG10S766    122257534                        122257929
DG10S767    122283871                        122284250
DG10S1361   122318975                        122319081
DG10S1680   122390160                        122390294
D10S1230    122407279                        122407403
DG10S772    122421708                        122421845
DG10S775    122463781                        122463941
DG10S777    122524358                        122524547
DG10S779    122580228                        122580603
DG10S784    122719087                        122719236
D10S1483    122948181                        122948324
D10S587     124728937                        124729112

Single marker association analysis with the microsatellite markers identified association with DG10S478 (Table 2 and the FIGURE).

TABLE 2
DG10S478 Association to Type II Diabetes in Iceland

Allele   Affected freq   Control freq   RR [95% CI]          Two sided P
         (n = 1185)      (n = 931)
0        0.636           0.724          0.67                 2.1 × 10⁻⁹
4        0.005           0.002          2.36                 0.12
8        0.093           0.078          1.21                 0.090
12       0.242           0.178          1.48                 4.6 × 10⁻⁷
16       0.022           0.015          1.53                 0.076
20       0.001           0.003          0.39                 0.17
X        0.364           0.276          1.50 [1.31, 1.71]    2.1 × 10⁻⁹

Six alleles are observed with this tetra-nucleotide repeat, with alleles 0, 8 and 12 accounting for 98% of chromosomes in the population controls. Allele 0 showed a protective association (Relative Risk (RR)=0.67; P=2.1×10⁻⁹) relative to the other alleles combined. This P-value is two-sided and takes into account that some of the patients are related to each other. DG10S478 is located in intron 3 of the transcription factor 7-like 2 (TCF7L2, formerly TCF4) gene on 10q25.2. This marker is within a well defined LD block of 74.9 kb (based on the CEPH Caucasian HapMap Phase II) that encapsulates part of intron 3, the whole of exon 4 and part of intron 4 (the FIGURE).

When DG10S478 was genotyped in the CEPH Caucasian HapMap families, it became clear that allele G of SNP rs12255372 is nearly perfectly correlated with allele 0 of DG10S478 (r²=0.95, P=5.53×10⁻³⁸), and that allele T of rs12255372 is correlated with the other alleles of DG10S478. Moreover, the risks conferred by alleles 8 and 12 of DG10S478 do not differ (P=0.3). Hence it is natural to collapse all the non-0 alleles of DG10S478 into a composite allele, which will be referred to as allele X. Allele X has a frequency of 27.6% in controls and 36.4% in patients. Assuming a multiplicative model (16, 17), allele X has an estimated RR of 1.50 per copy carried, compared to the risk for non-carriers.
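As a rough arithmetic check on these figures, the per-copy relative risk under a multiplicative model can be approximated by the allelic odds ratio computed from the case and control allele frequencies. The Python sketch below is illustrative only. The function name is hypothetical, and it assumes that the tabulated RR corresponds to this odds ratio. It does not attempt to reproduce the P-values or confidence intervals, which take relatedness among patients into account.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds of the allele on case chromosomes divided by its odds on control chromosomes."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Icelandic allele frequencies taken from Table 2.
rr_x = allelic_odds_ratio(0.364, 0.276)  # composite at-risk allele X
rr_0 = allelic_odds_ratio(0.636, 0.724)  # protective allele 0

print(f"allele X: {rr_x:.2f}")
print(f"allele 0: {rr_0:.2f}")
```

With the Icelandic frequencies this gives approximately 1.50 for allele X and 0.67 for allele 0, in line with Table 2.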

Replication of the DG10S478 Association to Type II Diabetes

To verify the association of DG10S478 to type II diabetes, the microsatellite was genotyped in a Danish type II diabetes cohort of 228 cases and 539 controls. The Danish cohort was selected from the PERF (Prospective Epidemiological Risk Factors) study in Denmark (19). This female type II diabetes cohort had been diagnosed previously with type II diabetes. The association observed in Iceland was replicated (Table 3).

TABLE 3
DG10S478 Association to Type II Diabetes in Denmark

Allele   Affected freq   Control freq   RR [95% CI]          Two sided P
         (n = 228)       (n = 539)
0        0.669           0.740          0.71                 0.0048
4        0.002           0.004          0.59                 0.62
8        0.070           0.048          1.49                 0.091
12       0.239           0.190          1.34                 0.032
16       0.020           0.018          1.12                 0.78
X        0.331           0.260          1.41 [1.11, 1.79]    0.0048

The composite at-risk allele X has a frequency of 26.0% in controls and 33.1% in type II diabetes cases, giving an estimated RR of 1.41 (P=0.0048).

Subsequently, the microsatellite was genotyped in a US Caucasian type II diabetes cohort of 361 cases and 530 controls from the PENN CATH study. This study is a cross sectional study of the association of biochemical and genetic factors with coronary atherosclerosis in a consecutive cohort of patients undergoing cardiac catheterization at the University of Pennsylvania Medical Center. Type II diabetes was defined as a history of fasting blood glucose ≥126 mg/dl, 2-hour post-prandial glucose ≥200 mg/dl, use of oral hypoglycemic agents, or insulin and oral hypoglycemic agents in a subject greater than age 40. The association observed in Iceland was also replicated in this population (Table 4).

TABLE 4
DG10S478 Association to Type II Diabetes in the United States

Allele   Affected freq   Control freq   RR [95% CI]          Two sided P
         (n = 361)       (n = 530)
−4       0.001           0.000          -                    -
0        0.615           0.747          0.54                 3.3 × 10⁻⁹
4        0.003           0.004          0.73                 0.72
8        0.085           0.049          1.79                 0.0029
12       0.256           0.180          1.57                 1.2 × 10⁻⁴
16       0.040           0.020          2.07                 0.012
X        0.385           0.253          1.85 [1.51, 2.27]    3.3 × 10⁻⁹

The composite at-risk allele X has a frequency of 25.3% in controls and 38.5% in type II diabetes cases, giving an estimated RR of 1.85 (P=3.3×10⁻⁹). Combining the results from all 3 cohorts using a Mantel-Haenszel model (NOTE 3) yields an overall two-sided P of 4.7×10⁻¹⁸.
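One way to see how the three cohorts combine is to compute a Mantel-Haenszel pooled odds ratio for allele X, treating each cohort as a stratum. The Python sketch below is an approximation under stated assumptions: chromosome counts are reconstructed from the rounded frequencies in Tables 2 to 4 as 2 × n × frequency, the function name is hypothetical, and no adjustment for relatedness is made, so the combined P-value is not reproduced.

```python
# (cases n, allele X freq in cases, controls n, allele X freq in controls)
cohorts = {
    "Iceland": (1185, 0.364, 931, 0.276),
    "Denmark": (228, 0.331, 539, 0.260),
    "USA": (361, 0.385, 530, 0.253),
}


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over 2x2 tables of chromosome counts."""
    numerator = 0.0
    denominator = 0.0
    for n_cases, f_cases, n_controls, f_controls in strata:
        a = 2 * n_cases * f_cases              # case chromosomes carrying X
        b = 2 * n_cases * (1 - f_cases)        # case chromosomes carrying allele 0
        c = 2 * n_controls * f_controls        # control chromosomes carrying X
        d = 2 * n_controls * (1 - f_controls)  # control chromosomes carrying allele 0
        total = a + b + c + d
        numerator += a * d / total
        denominator += b * c / total
    return numerator / denominator


pooled_or = mantel_haenszel_or(cohorts.values())
print(f"pooled odds ratio for allele X: {pooled_or:.2f}")
```

Under these assumptions the pooled estimate comes out close to the combined RR of 1.56 reported for allele X in this specification.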

The association of the composite at-risk allele to type II diabetes in three populations constitutes strong evidence that variants of the TCF7L2 gene contribute to the risk of type II diabetes.

After establishing beyond doubt the association of allele X to type II diabetes, we investigated the mode of inheritance more closely. The dominant model and recessive model can be rejected, as the heterozygous carriers clearly have increased risk relative to the non-carriers (P<1×10⁻⁶) and reduced risk compared to the homozygous carriers (P<0.0001). The multiplicative model provides a better fit, but there is evidence that the risk of the homozygous carriers relative to the heterozygous carriers is greater than that of the heterozygous carriers relative to the non-carriers. Table 5 provides model-free estimates of the relative risks of the heterozygous carriers and homozygous carriers compared to the non-carriers.

TABLE 5
Model-free estimates of the relative risks

                     Genotype Relative Risk
Cohort      00    0X [95% CI]          XX [95% CI]          PAR
Iceland     1     1.41 [1.17, 1.70]    2.27 [1.70, 3.04]    0.21
Denmark     1     1.37 [0.98, 1.90]    1.92 [1.13, 3.26]    0.17
USA         1     1.64 [1.23, 2.19]    3.29 [2.13, 5.07]    0.28
Combined    1     1.45 [1.26, 1.67]    2.41 [1.94, 3.00]    0.21
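The departure from a strictly multiplicative model can be illustrated with the point estimates in Table 5. Under a multiplicative model the homozygous relative risk would equal the square of the heterozygous relative risk, so the step from 0X to XX would equal the step from 00 to 0X. The Python sketch below is a simple comparison of point estimates, with hypothetical variable names, and it ignores the confidence intervals.

```python
# Model-free point estimates from Table 5: (RR for 0X, RR for XX), both relative to 00.
model_free = {
    "Iceland": (1.41, 2.27),
    "Denmark": (1.37, 1.92),
    "USA": (1.64, 3.29),
    "Combined": (1.45, 2.41),
}

step_ratios = {}
for cohort, (rr_het, rr_hom) in model_free.items():
    predicted_hom = rr_het ** 2      # XX risk expected under a multiplicative model
    step_up = rr_hom / rr_het        # observed risk of XX relative to 0X
    step_ratios[cohort] = step_up / rr_het
    print(f"{cohort}: multiplicative XX = {predicted_hom:.2f}, "
          f"observed XX = {rr_hom:.2f}, XX vs 0X = {step_up:.2f}")
```

In every row the observed step from 0X to XX exceeds the step from 00 to 0X, which is the pattern described in the text preceding Table 5.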

The three cohorts have similar population frequencies for the at-risk allele, but the RR estimates vary, with the strongest effect seen in the US cohort and the weakest in the Danish cohort. While there is no reason for the RR to be identical in the cohorts, it is noted that the differences in the estimated relative risks do not quite reach statistical significance (P>0.05). Combining the results from the cohorts assuming common relative risks, the heterozygous carriers and homozygous carriers are estimated to have relative risks of 1.45 and 2.41, respectively, compared to the non-carriers (Table 5). Assuming a population frequency of 26% for the at-risk allele, heterozygous and homozygous carriers make up 38% and 7% of the population, respectively. Hence, this variant has enough predictive value to be of clinical use. The corresponding population attributable risk is 21%, which is substantial from a public health point of view.
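The carrier fractions and the population attributable risk (PAR) quoted here can be checked with a short calculation. The Python sketch below assumes Hardy-Weinberg genotype proportions and the common definition of PAR as (mean RR − 1) / mean RR, where the mean is weighted by genotype frequency. Both are assumptions made for illustration, and the function names are hypothetical.

```python
def genotype_fractions(allele_freq):
    """Hardy-Weinberg fractions of non-carriers (00), heterozygotes (0X) and homozygotes (XX)."""
    p = allele_freq
    q = 1.0 - p
    return q * q, 2.0 * p * q, p * p


def population_attributable_risk(allele_freq, rr_het, rr_hom):
    """PAR relative to a population in which everyone has the non-carrier risk."""
    f_00, f_0x, f_xx = genotype_fractions(allele_freq)
    mean_rr = f_00 * 1.0 + f_0x * rr_het + f_xx * rr_hom
    return (mean_rr - 1.0) / mean_rr


f_00, f_0x, f_xx = genotype_fractions(0.26)
par = population_attributable_risk(0.26, rr_het=1.45, rr_hom=2.41)

print(f"heterozygous carriers: {f_0x:.0%}")
print(f"homozygous carriers:   {f_xx:.0%}")
print(f"PAR:                   {par:.0%}")
```

With an allele frequency of 26% and the combined relative risks of 1.45 and 2.41, this gives about 38% heterozygous carriers, 7% homozygous carriers and a PAR of about 21%.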

It should also be noted that allele X is in excess in impaired fasting glucose (IFG) individuals (fasting serum glucose between 6.1 and 6.9 mM). The composite at-risk allele X has a frequency of 27.7% in 1393 controls and 37.1% in 278 IFG cases, giving an estimated RR of 1.54 (P=1.36×10⁻⁵).

Association of SNP Markers within the Exon 4 LD Block of TCF7L2 with Type 2 Diabetes

In Table 6 we list microsatellite and SNP markers residing within the exon 4 LD block of TCF7L2. The table contains publicly available SNPs, as well as SNPs discovered by sequencing the entire LD block region. The table furthermore provides polymorphic microsatellite markers residing within the block.

TABLE 6
Polymorphic markers residing within the exon 4 LD block of TCF7L2 (between markers rs4074720 and rs7087006; positions in Build 34 co-ordinates: rs4074720 (B34: 114413084) - rs7087006 (B34: 114488013) = 74929 bp). Sequence identification references are indicated as appropriate, referring in each instance to the SEQ ID number for the amplimer containing the polymorphism, and forward and reverse primers, as disclosed in the Sequence listing.

A. Public SNPs (including all HapMap ethnicities)

Public Alias   Chromosome 10 B34 location   Base Change   Sequence ID NO:
rs4074720      114413084                    A/G
rs4074719      114413145                    C/T
rs4074718      114413204                    C/T
rs11196181     114413605                    A/G
rs11196182     114414744                    C/T
rs4603236      114414765                    G/T
rs7922298      114414856                    C/T
rs17747324     114417090                    C/T
rs7901695      114418675                    C/T           17-19
rs11196185     114420079                    C/T
rs4132115      114420083                    A/C
rs4506565      114420628                    A/T           14-16
rs7068741      114420845                    C/T
rs7069007      114420872                    C/G
rs7903146      114422936                    C/T           11-13
rs11196187     114424032                    A/G
rs7092484      114425520                    A/G
rs10885402     114426284                    A/C
rs12098651     114426306                    A/G
rs6585198      114426824                    A/G
rs7910244      114427209                    C/G
rs12266632     114429546                    C/G
rs6585199      114429758                    A/G
rs7896811      114431304                    C/T
rs6585200      114433196                    A/G
rs6585201      114433370                    A/G
rs4319449      114433993                    G/T
rs12220336     114434854                    A/G
rs7896091      114436550                    A/G
rs12354626     114437016                    A/G
rs7075199      114437307                    C/G
rs7904519      114438514                    A/G
rs13376896     114441336                    A/C
rs10885405     114442257                    C/T
rs10885406     114442311                    A/G
rs11196192     114446874                    G/T
rs6585202      114447390                    C/T
rs7924080      114451599                    C/T
rs7907610      114451677                    A/G
rs12262948     114452313                    C/G
rs12243326     114453402                    C/T           8-10
rs12265110     114453606                    C/T
rs7077039      114453664                    C/T
rs11196198     114456472                    A/G
rs12775336     114459590                    G/T
rs7904948      114459672                    A/T
rs7100927      114460635                    A/G
rs11196199     114460704                    A/G
rs17685538     114462058                    C/G
rs11592706     114463573                    C/T
rs7081912      114463678                    A/G
rs7895340      114466112                    A/G           23-25
rs11196200     114466525                    C/G
rs11196201     114467894                    A/T
rs11196202     114470254                    A/G
rs11196203     114470447                    A/C
rs11196204     114470518                    A/G
rs11196205     114471634                    C/G           20-22
rs10885409     114472659                    C/T
rs12255372     114473489                    G/T           5-7
rs12265291     114474827                    C/T
rs7904443      114475774                    A/G
rs11196208     114475903                    C/T
rs7077247      114476658                    C/T
rs11196209     114477314                    A/G
rs4077527      114477628                    A/G
rs12718338     114477634                    C/T
rs11196210     114478558                    C/T
rs7907632      114481823                    A/G
rs7071302      114482114                    G/T
rs12245680     114484778                    C/T
rs11196213     114486141                    C/T
rs4918789      114486394                    G/T
rs7085785      114487050                    C/T
rs7085989      114487326                    A/G
rs7087006      114488013                    A/G

B. Novel SNPs discovered and subsequently validated in the exon 4 LD block of TCF7L2 (amplimers below):

deCODE Alias   Chromosome 10 B34 location   Base Change   Sequence ID NO:
SG10S405       114418658                    C/T           26-28
SG10S428       114421901                    A/C           29-31
SG10S422       114457824                    A/G           32-34
SG10S427       114463480                    A/T           35-37
SG10S408       114466074                    A/T           38-40
SG10S409       114471574                    A/C           41-43
SG10S406       114471618                    C/G           42-44
SG10S407       114473534                    C/G           45-47

C. Polymorphic microsatellites within the exon 4 LD block of TCF7L2 (amplimers below):

Microsatellite   C10 B34 Start   C10 B34 End   Sequence ID NO:
DG10S2164        114460344       114460627     48-50
DG10S478         114460845       114461228     2-4
DG10S479         114475487       114475632     51-53

TABLE 7 Amplimers and primers for selected markers within the exon 4 LDblock of TCF7L2 >DG10S478TTCAGGCCATTGGTGTTGTATATATTTCAAGATTTGCTCACAGGTCCAAAGCTTAACTTAAGCTCCCTGAGACATATCATAAAATATGATTTGGGGAAAAACCCTAATGGGCCATGATCAGAACATTATTATTCAACAAAGGATGAAATGCTTAAGCCAAGATGGCCTTCTTTCTTTCTTTCTTTCTTTCTTTTTTTTTAATGAAAGTTGAGCAGACTCCCGTCCAACAGTTTTCAATGTAGGAATTCCCACAGCCCCATTTGATTGCAGTTTGTTGAAAAGTTTAATGTTTTTGTAGGCAATTCATAATTTCCACATTGAACAGCCTGAGAGGAAGAGAGCTGGAGCCCACTGTTGTTTTTGTAGTGGGATGGTGGGAACTTT (SEQ ID NO: 2) Primers: F:TTCAGGCCATTGGTGTTGTA (SEQ ID NO: 3) R: AAAGTTCCCACCATCCCACT (SEQ ID NO:4) >rs12255372 TTGTCCCTTGAGGTGTACTGGAAACTAAGGCGTGAGGGACTCATAGGGGTCTGGCTTGGAAAGTGTATTGCTATGTCCAGTTTACACATAAGGATGTGCAAATCCAGCAGGTTAGCTGAGCTGCCCAGGAATATGCAGGCAAGAAT KACCATATTCTGATAATTACTCAGGCCTCTGCCTCATCTCCGCTGCCCCCCCGCCCCCTGACTCTCTTCTGAGTGCCAGATTCAGCCTCCATTTGAATGCCAAATAGACAGGAAATTAGCATGCCCAGAATCCACGTCTTTAGTGCACTCTCTGCCCAGCTCCAAACCTGTTACTGCTTGTGTTCAACATCTCAGTAAAGC TCAACAACATCGACCCATT(SEQ ID NO: 5) Primers: F: TTGTCCCTTGAGGTGTACTGG (SEQ ID NO: 6) R:AATGGGTCGATGTTGTITGAG (SEQ ID NO: 7) >rs12243326GCTGTGAAATCCCCTGTGTAGTGGGAAGAAGAAATAGCAAATCTTAGCTGCCTTGGACCTGATATAATTATTTGTCTTCATTTACATGGTT YATCCTTCAAGGTTGAATAAATGATGTGGGAGCTAGTCAAGGGGCTTTAGGTATGTGATTTCATGCCTACTTTTTTTTAGGTAGAGAAACTGAGGTCACAGGGTACTAGAGAATGGACTCTAAGATTCAGGTTTCTGAATTGCCTGTGGTTTTGTTGACTCAACTGCTCTTCTGTTGTTTTTTAGCCACATGCCTTGAAACAGTCCTCTTTCCCATGTTTCTTCATCAGCACCATTAACCCAAGGTATACTGTCCTCTCTTATCTTTCACAAGGTCTTGGAGTTCCCATGCCTTTGTAAGCATCCCTCCCCGAGATTCAGCACCAACCAAAATCACATTTGGAAAAATTGCTTGTTTCCCAAGAAGCTTTGGAGGATATGATTTTGTATAGAACGGGTTCACAGGTTTTCTGTTCATTCTTCTATGGTGGAGTGTGTGTGTATGTGACTCT GTCTTCTCTCCATTCC (SEQID NO:8) Primers: F: GCTGTGAAATCCCCTGTGTAG (SEQ ID NO: 9) R:GGAATGGAGAGAAGACAGAGTCA (SEQ ID NO: 10) >rs7903146AAGGGAGAAAGCAGGATTGAGCAGGGGGAGCCGTCAGATGGTAATGCAGATGTGATGAGATCTCTGCCGGACCAAAGAGAAGATTCCTTTTTAAATGGTGACAAATTCATGGGCTTTCTCTGCCTCAAAACCTAGCACAGCTGTTATTTACTGAACAATTAGAGAGCTAAGCACTTTTTAGATA 
YTATATAATTTAATTGCCGTATGAGGCACCCTTAGTTTTCAGACGAGAAACCACAGTTACAGGGAAGGCAAGTAACTTAGTCAATGTCAGATAACTAGGAAAAGGTTAGAGGGGCCCTGGACACAGGCCTGTGTGACTGAGAAGCTTGGGCACTTCACTGCTACATTTCATCTCTTCGCT (SEQ ID NO: 11) Primers: F:AAGGGAGAAAGCAGGATTGA (SEQ ID NO: 12) R: AGCGAAGAGATGAAATGTAGCA (SEQ IDNO: 13) >rs4506565 CTGATGAGGGTAGGGAGCATCTGTCTGCAGCTTCATCTTCATTGTCTAGGGGCTCCAGAAATATCTGTGAGTAAATAAGTTATTTAATCTTTGCCTCAAATTTCCAGTGACTGTAGGGATATAGCTGTGAGCCTCTAGGAGCTGAGATTTTTTAAATTTCCCACTTAAACATTTATTTAAAAATTTTGTGCTCAGCATGGACTAAGGACTTTACATTCATTAACTCATTTACAGCTTGATCCTATGCGGTGGGCATTCATTTACAGAGGATCCCATTTTACAGGTGAGGAAGAGGCCAGCTAGGGGTGCAGCCTAGGTTAGTATTCTAGAGCTCATCAGGCTGTGTTGTCCCCAGTGAAAGAATAAGCAAAGAAGTGAATGTTGTGCATTGAGAAAAATGACTCTCGGAGGAGGATGAGCGTCTCGGATATGGCGACCGAAGTGAT WTGGGGCCCTTGTCAAGGGTCTCTATTATGGCATCAAGAAAAGATGCTGCTTTCGGTGATGCCCGAGGAGAGCCTCAATATTTTACATGGGAAACCTAAAAAAGGGGCCATGTTGTGGTCTCTGCACCTAAGA (SEQ ID NO: 14) Primers: F:CTGATGAGGGTAGGGAGCA (SEQ ID NO: 15) R: TCTTAGGTGCAGAGACCACAAC (SEQ IDNO: 16) >rs7901695 TATTTAGAAACCATAAAATCCACCTATTTGAGGTGTACAATTGAGTGATTTTCTGTATAGTCACAGATCTGTGCAGTCATCCACACCCTCTAACTCCAGGACATTTTCCTCACCCCCGAGGAGAAACCTCCCTTACCCATTAGCAGTCACTCCTCATTTCCTCTCCCCCCAGCCCCTGGCAATCACTGTGGATTTGCCTGTTCTTGACATTTCATATAAATGGTATCATAAAATCTA YGGGCTTTTGTGTCTGTCTGCTTTCACTTAGCATACGGTTCTCAAGGTTCATCCAGTATTGTAGCATCTATCAGTATGTCATTCCTTTTTATGGCCAAATAATATTTTATTGTATGGATAGACATTTTGTTTATTCATTTATCTGTTTTTGGTTATTATGAGTAACACTACTATGAACATTTTGCACAAATTTTTGTATTGACATGTTTTCATTTCTCCTGGGTATAGTCCTATGAGTGGAATTGCTGG (SEQ ID NO: 17)Primers: F: TATTTAGAAACCATAAAATCCACCTAT (SEQ ID NO: 18) R:CCAGCAATTCCACTCATAGGAC (SEQ ID NO: 19) >rs11196205TTGTCTCCTTTTGTTTCTGCTACTGTGAATGATCCTGTGATGATCATCTTTGTGTGTAAATCTTTGTCCCCTCGCCCCCTCCCCTTTTATTATTTTCTTGGGATAGACCCCAGGACAAAAGGTAGAAAAGAACAAAGTGTTAAAAAATTTCTTGATACATAGCCACAGATTATTTTCCTGAAAGTTCTCAACATTTATAA CTAC 
SAGCAGTATGTAAGAGAGTTATGGTTGGAATGATTTTAATGTCTCTGGGGAATTTAACAACAAAAAAACTTTAGGCTTCTTTGGAGAGAGACATGCCCTTAACTCCACCCCGCCCTAGAACAGAGACCCAGCCCATCCAAGTCAGCCTCCCCAGGTCCTCCACCTTCAAAACAGGCAAACGAAATCATTTCTTGAATAATTGGTAGGCTTCAAGGTCAGATGTT (SEQ ID NO: 20) Primers: F:TTGTCTCCTTTTGTTTCTGCTAC (SEQ ID NO: 21) R: AACATCTGACCTTGAAGCCTAC (SEQID NO: 22) >rs7895340 TCAGGGACAGTGCATAGGTGTAAAGAAGTTGCTGGTTGGGGGTTCTAATGCAGGTTTCTCCAAAAGTGAATGCCCTGTTAAAAAAAAATTCTTAACAAATATACAGAGATTTTTTTTTTAAAAAAGTGTGACAGTTCTAGACACCTAGAG AGTAAA RTGAAGAAGCCTGTTTTCAGGTITCCCGCCTCCCTGAATTTCCCAGCATGGTCCAGGCTTTGAAATTTATTTATCTGCTTTTGGCAATGGTTGATGGGAATTTCCCACATTTATTTTTTAGCTACAGAGAAAGGACATTATCTTTAAAATCTCTTCGTTGTTCTCTCTCTTTGA (SEQ ID NO: 23) Primers: F: TCAGGGACAGTGCATAGGTG(SEQ ID NO: 24) R: TCAAAGAGAGAGAACAACGAAGA (SEQ ID NO: 25) >SG10s405TATTTAGAAACCATAAAATCCACCTATTTGAGGTGTACAATTGAGTGATTTTCTGTATAGTCACAGATCTGTGCAGTCATCCACACCCTCTAACTCCAGGACATTTTCCTCACCCCCGAGGAGAAACCTCCCTTACCCATTAGCAGTCACTCCTCATTTCCTCTCCCCCCAGCCCCTGGCAATCACTGTGGATTTGCCTG TTCTTGACATTTCATATAAAY GGTATCATAAAATCTATGGGCTTTTGTGTCTGTCTGCTTTCACTTAGCATACGGTTCTCAAGGTTCATCCAGTATTGTAGCATCTATCAGTATGTCATTCCTTTTTATGGCCAAATAATATTTTATTGTATGGATAGACATTTTGTTTATTCATTTATCTGTTTTTGGTTATTATGAGTAACACTACTATGAACATTTTGCACAAATTTTTGTATTGACATGTTTTCATTTCTCCTGGGTATAGTCCTATGAGTGGAATTGCTGGGTCATATAATAAATAACTGTTTAACATTTTGGGGAGCTGCCAAACTTTTAAAACCTTGGGTTCTGTGATGTACCAGTTGTGTTAG GCA (SEQ ID NO: 26)Primers: F: TATTTAGAAACCATAAAATCCACCTAT (SEQ ID NO: 27) R:TGCCTAACACAACTGGTACATC (SEQ ID NO: 28) >SG10S428TGCCAGGGGTTTTATGGTTAATTTTCCTCCATTATGAGGGTTGACTCAGCCTTGGGTATTAGATGTCTTTGAGAATCCAGGGTTCAAATACCACAGCTGGTAGAATGTTTCTCAACTTGGAGCCAATCTCCATCTACTGAAGGTACGCTGGTTTAGACAGACAACAGGGACATCAGCATTTTAAAAAGCGGTGGAAAAAGTTTGCTTGTCTTGATTGGAGCCATGACATTTTATTTTGAAATTTCAAATAACATGAAGGGAGGTTTGGAGCGGTTTTTGGTTTATCCAAAGGGCAGTGGATTGAAGGCTGAGAAACACCAGGCTGAATGGGAGAGGGGTTGGGGTCCCCCTGTGAGATAGTGAAACAATGGTAGTGCCATCCAATGATAGGCACTTTTCTGTCATTCAGAAGCAGAAAGGGGGCCAGAGGCCCATTGGCCTTACTGGG 
MAGTAAGCTGTAGAGCTGCTGCCTTTTCGTGAAAGGGTTGACACCAACCTTCTCCCCCAGGAAGAGTGACCAGGGACCTGAGGGGCATGGTCGAGCAGATGACAGCCTTTGTAAAACATCTCC (SEQ ID NO: 29) Primers: F: TGCCAGGGGTTTTATGGTTA(SEQ ID NO: 30) R: GGAGATGTTTTACAAAGGCTGTC (SEQ ID NO: 31) >SG10S422TTGGTAGAGATGGGGTCTCCTAGGCTGGTCTTGAACTCCTGG RCTCAAGCAATCTTCCTGCCTCAGCCTTCCAAAGTACTGGGATTACTGGCGTGGGCCACCATGCCTGGCTTGAAATTTTTCTATGGCTTTATTCTTTCTCCAAGTACAGAGTCTACCCAACCTTGAGATCTTTGGTTTTCTTTTCCTAGGTAACTATAGTACATACTTATTTATGTTAAACAAGAGCAATCACACATTTCTTTTTCTATACAGTCATGCTTTATAGGCAAATAAAGCCTCCGTCTTAGGCTTTCTGGATTTTTTCAAAAGATGCAATTCCTGGAGTATGTTTTTACTTAGAGCAAAGCAGCCTAGTCTCCTATACCTTCTGCATCTGCAGAAAAGTTGGTTAAACAGACTTTGTAATGATGCCCCTTACAATTCTGAAGGGACTTGTGAAATAGTTTCACAGAGTTTCAGTGTTAGGTATATTTGATCAATGCTAACTTTTGGAAAAGTTTGGTGCCTGTATGATTCAGAGGGTAGGGCAGAATATTAAATTAATCACAACTTCTTGTATTTTAACCATTCTGGGTAAATTGGGATTCC GTGACGCCCAGGCAAAATTAT(SEQ ID NO: 32) Primers: F: TTGGTAGAGATGGGGTCTCC (SEQ ID NO: 33) R:ATAATTTTGCCTGGGCGTCA (SEQ ID NO: 34) >SG10S427TATCTTATATCCCCTCCAAGCATTCATTAACTGATGGATTAGTGAGTTGGCCTTGAGAAGCATAAAGGCTCGTCTCCATGTGCTTCTAAGCATTGTGTCTAAGTTCTGTTTGGTTTCCTGAGTGAAACTGTCTTAATGTTACCAACAGAA GTTAAATGCCTAAGAG WTTCTTATACATGGGCTGAGTACCTCTGTGACTGGGCAAGCCACCTCACCTCATTTTACCTTGTCTGCAAAATGAGGAACTGGGTCAACTCATCGTTCAAATCTCACTGAAAGCTAATTGATCGCTTTTGACAGAAGTAGCTCCCTTGGGCCGTATATTTATTTCCTAGCTTGGAGGAAGGTGGGGACAGACAGAATTGATGTACACCTTTATTTTTATCTCTATGGTAAACCTGTGCATACTAAAGCATTCCTCTGGTCTTTTGAGATGAGTGTATACATTGTGTCTGGCCCTGTGCATTTTTTACCAAGAAGTAAGTTTTGTTGAGTAAACTTGGGTTGTATGAAGAACTGCATGCTCACCGTACTCAAGTAGCTTTTGCTACCTAAAGGACAGCTGCTCATATGTACTTGACTTCCTTTAAAGTGAAGGATGATGACATTTGAAAAAC GGAGGTTGAAAAGGAG (SEQID NO: 35) Primers: F: TATCTTATATCCCCTCCAAGCATTC (SEQ ID NO: 36) R:CTCCTTTTCAACCTCCGTTTT (SEQ ID NO: 37) 
>SG10S408TTGAGCATGTGTTATTTAATGAGTTATACGTCTGTCATATGTGTGTGTTTATATCACAAAATAACTTATTTTTATAAAACCATATTTTGAGTCATCATTTGTGACAATGTCTTCTTTTCTCTGGTATAAATGAGGCATGTAGAAAGAAGATTGACATTTGCTAGAAGCTTCCCCTTTCCTCTAACTCCACAATAAAATGGATGCTCATAATTACATCTGCTCCTATAAGGTCAAGATTTCAGGGCTGGAAGTGACCTTAGATCATTTAGGCCCAACTTGCCCTCAGGAAAGGAAACTGAGGCCCAGAGATGCCTTAAGTGAATTGCCCAATGTCACACGCTGAGTCAGTGGCCAGAGCAAGGCTTGGATCCAGTTCTCTGCTCCCTTTCCAGAGCCTTGTGATGTCTTCTCTCCTACAGGAGGTGAAAATAACTGCTGTGGCTGGTTCTGTTTTGCTGACTGTAAATTGGGTGATGGTCAGGGACAGTGCATAGGTGTAAAGAAGTTGCTGGTTGGGGGTTCTAATGCAGGTTTCTCCAAAAGTGAATGCCCTGTnTAAAAAAAAATTCTTAACAAATATACAGAGATTTTTTTTT WAAAAAAGTGTGAGAGTTCTAGACACCTAGAGAGTAAAGTGAAGAAGCCTGTTTTCAGGTTTCCCGCCTCCCTGAATTTCCCAGCATGGTCCAGGCTTTGAAATTTATTTATCTGCTTTTGGCAATGGTTGATGGGAATTTCCCACATTTATTTTTTAGCTACAGAGAAAGGACATTATCTTTAAAATCTCTTCGTTGTTCTCTCTCTTTGAGTGAGGAGAGAAGATGTGAATCCTGGCAGTGGTTCAGAGTGGACACAGCCCCTGTGTTTGTGGCATAGGCTCTGTGGGCCCCATGCCAGGGAGCAGTACCCCCGTGTAAAGGAGTGGGGGTTTGTCCATTTGGATAGAGCAAAGATCCTCCACCTCAAATCCCACAAGAACAGTTGCCACAACCTGGGCCCTAAGCATCTCATTTTCCTATGTAGAAATTAATGATCTGGAGGAGATGGCAAAACATTCCTTCCAGAGCCTGTGTGGATTTTGG (SEQ ID NO: 38) Primers: F:TTGAGCATGTGTTATTTAATGAGTTA (SEQ ID NO: 39) R: CCAAAATCCACACAGGCTCT (SEQID NO: 40) >SG10S409 TAGTGCTCAGTATTTCCAACGTTCTGTTTATTTAAGATGAAAATTGCTGTAGTTAATAAGCACTTCCCCATGTCATTAAAATGCTTAAGGATTTTTAATGACCACATAACAGTCCATAATATGATTAAACCCCAATTTACTGAATCAATGCCATATTGTTGGGTCTTTAGATTGTCTCCTTTTGTTTCTGCTACTGTGAATGATCCTGTGATGATCATCTTTGTGTGTAAATCTTTGTCCCCTCGCCCCCTCCCCTTTTATTATTTTCTTGGGATAGACCCCAGGACAAAAGGTAGAAAA GAACAAAGTGTTAAA MAATTTCTTGATACATAGCCACAGATTATTTTCCTGAAAGTTCTCAACATTTATAACTACGAGCAGTATGTAAGAGAGTTATGGTTGGAATGATTTTAATGTCTCTGGGGAATTTAACAAGAAAAAAACTTTAGGCTTCTTTGGAGAGAGACATGCCCTTAACTCCACCCCGCCCTAGAACAGAGACCCAGCCCATCCAAGTCAGCCTCCCCAGGTCCTCCACCTTCAAAACAGGCAAACGAAATCATTTCTTGAATAATTGGTAGGCTTCAAGGTCAGATGTT (SEQ ID NO: 41) Primers: F:TAGTGCTCAGTATTTCCAACGTTCT (SEQ ID NO: 42) R: AACATCTGACCTTGAAGCCTACC(SEQ ID NO: 43) 
>SG10S406TAGTGCTCAGTATTTCCAACGTTCTGTTTATTTAAGATGAAAATTGCTGTAGTTAATAAGCACTTCCCCATGTCATTAAAATGCTTAAGGATTTTTAATGACCACATAACAGTCCATAATATGATTAAACCCCAATTTACTGAATCAATGCCATATTGTTGGGTCTITAGATTGTCTCCTTTTGTTTCTGCTACTGTGAATGATCCTGTGATGATCATCTTTGTGTGTAAATCTTTGTCCCCTCGCCCCCTCCCCTTTTATTATTTTCTTGGGATAGACCCCAGGACAAAAGGTAGAAAAGAACAAAGTGTTAAAAAATTTCTTGATACATAGCCACAGATTATTTTCCT GAAAGTTCT SAACATTTATAACTACGAGCAGTATGTAAGAGAGTTATGGTTGGAATGATTTTAATGTCTCTGGGGAATTTAACAACAAAAAAACTTTAGGCTTCTTTGGAGAGAGACATGCCCTTAACTCCACCCCGCCCTAGAACAGAGACCCAGCCCATCCAAGTCAGCCTCCCCAGGTCCTCCACCTTCAAAACAGGCAAACGAAATCATTTCTTGAATAATTGGTAGGCTTCAAGGTCAGATGTT (SEQ ID NO: 44) Primers: F:TAGTGCTCAGTATTTCCAACGTTCT (SEQ ID NO: 42) R: AACATCTGACCTTGAAGCCTACC(SEQ ID NO: 43) >SG10S407TGGTATGTCCAGTTTACACATAAGGATGTGCAAATCCAGCAGGTTAGCTGAGCTGCCCAGGAATATCCAGGCAAGAATGACCATATTCTGATAATTACTCAGGCCTCTGCCTCATCTCCGGTG SCCCCCCGCCCCCTGACTCTCTTCTGAGTGCCAGATTCAGCCTCCATTTGAATGCCAAATAGACAGGAAATTAGCATGCCCAGAATCCACGTCTTTAGTGCACTCTCTCCCCAGCTCCAAACCTGTTACTGCTTGTGTTCAACATCTCAGTAAAGCTCAACAACATCGACCCATTACTTAGGCCTCAAACCTTGGGTGGCATCGTCGATTGCTCTTTTCTTTCATACCCCACATTCAACCCATCAGCCCATCCCACAGGCCCAAGTGTGTCCTCTCTACCTTCAAAGCGTGTGTGGCATCCACCGCTTATCACCACCTCTGCCATTACCACTGGAGTCCAGTGCCATCATCTCTCACTTGGATGTGGCCAGAGTGTCTTTGCTGGTCTCCTTCTTGCTTCCTACCTTTGTAACAGCCTATCATCTATCTCTGGTCTCCATAGCTCACTCCCATACTTTGAGAGGGCCTTTGAAAGCCTTAGACAGATCATATCACAGACCT CTATACTGAAAGTCGGG(SEQ ID NO: 45) Primers: F: TGCTATGTCCAGTTTACACATAAGG (SEQ ID NO: 46) R:CCCGACTTTCAGTATAGAGGTCTG (SEQ ID NO: 47) >DG10S2164CCATCTGTGGAGCAGAGTCACTGAAAGGAAATACTGGAAATACTGGAAGCCACTTGGTGTTTTATCAAGGATGTGAGGTTTCCTGGCAACTTTGTCGCCATATCATCATCATCATCACCATCATCATCATCATCATCATCATCATCATCATCATCATCATCATCATCTGCCCTTTAAGTTTTCTGCTTGTTTAGAAAAGAAATTTATACAGAGCCCCCAGTAGCAGCTGTAAGGGGGCAGGTTCTTGGAGCAGCCCATCCTCAACATTCTTGCTGCTGATGGAA (SEQ ID NO: 48) Primers: F:CCATCTGTGGAGCAGAGTCA (SEQ ID NO: 49) R: TTCCATCAGCAGCAAGAATG (SEQ ID NO:50) >DG1W5479 
TCCACGCAGAGAGGATCTAAATCTGGCTCTTTGCAATTGCCTTCATACATGTGCATACACACCACACACACACACACACACACACACACACACACACACACAGACACATACATATGCACACACCCCGAGTCAATGGAGGACCCTC (SEQ ID NO: 51) Primers:F: TCCACGCAGAGAGGATCTAAA (SEQ ID NO: 52) R: GAGGGTCCTGCATTGAGTCG (SEQ IDNO: 53)
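A quick consistency check on an amplimer and its primers is that the forward primer matches the start of the amplimer and the reverse primer is the reverse complement of its end. The Python sketch below applies this to DG10S478 using only the first and last bases of SEQ ID NO: 2 as printed in Table 7. The helper name is hypothetical, and the check handles only the unambiguous bases A, C, G and T, not the IUPAC ambiguity codes (such as K, Y or W) that mark the polymorphic positions in the SNP amplimers.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence made of A, C, G and T."""
    return sequence.translate(COMPLEMENT)[::-1]


forward_primer = "TTCAGGCCATTGGTGTTGTA"  # SEQ ID NO: 3
reverse_primer = "AAAGTTCCCACCATCCCACT"  # SEQ ID NO: 4

# First and last bases of the DG10S478 amplimer (SEQ ID NO: 2), copied from Table 7.
amplimer_start = "TTCAGGCCATTGGTGTTGTATATATTTCAAG"
amplimer_end = "GTTTTTGTAGTGGGATGGTGGGAACTTT"

forward_ok = amplimer_start.startswith(forward_primer)
reverse_ok = amplimer_end.endswith(reverse_complement(reverse_primer))

print("forward primer matches amplimer start:", forward_ok)
print("reverse primer matches amplimer end:  ", reverse_ok)
```

The same check could be applied to the other amplimers in Table 7 once their full sequences are supplied as strings.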

To further investigate the possibility that other marker alleles in the exon 4 LD block of TCF7L2 exhibit a higher correlation with type II diabetes than allele X, we used the DG10S478 genotype data generated in the HapMap CEU samples. The five SNPs from HapMap Phase I with strongest correlation to DG10S478 were, in descending order, rs12255372 (r²=0.95), rs7903146 (r²=0.78), rs7901695 (r²=0.61), rs11196205 (r²=0.43), and rs7895340 (r²=0.42). We genotyped these five SNPs in the three cohorts, and the correlations between the five SNPs and DG10S478, the latter treated as a biallelic marker, were very similar to those observed in the CEU samples. All five SNPs showed association to type II diabetes. While some SNPs showed slightly higher estimated relative risks and lower p-values in one or two of the cohorts, none exhibited stronger association to type II diabetes than DG10S478 when the results for all three cohorts were combined using the Mantel-Haenszel model. However, although rs11196205 and rs7895340 clearly have weaker association to type II diabetes, the strengths of the association for allele T of rs12255372 (RR=1.52, P=2.5×10⁻¹⁶) and for allele T of rs7903146 (RR=1.54, P=2.1×10⁻¹⁷) are comparable to that of allele X (RR=1.56, P=4.7×10⁻¹⁸).

Following the subsequent release of HapMap Phase II in October 2005, two additional SNPs were identified that show strong correlation to microsatellite DG10S478: rs12243326 (r²=0.961) and rs4506565 (r²=0.716). The alleles associated with susceptibility to type 2 diabetes will be C for rs12243326 (C/T SNP) and T for rs4506565 (A/T SNP).

It should be noted that among those haplotypes that carry the C allele of rs7903146, those that carry the A allele of rs10885406 have an estimated relative risk of 1.06 compared to those that carry the G allele of rs10885406, but the difference is not statistically significant (P=0.22).

In an attempt to replicate and refine this association with type 2 diabetes, we genotyped DG10S478, rs12255372 and rs7903146 in a large additional Danish cohort, consisting of 1111 cases and 2315 controls, and in a more genetically diverse West African cohort, consisting of 618 cases and 434 controls derived from the Africa America Diabetes Mellitus study (23). In the Danes, all three variants were strongly associated with disease risk, as previously observed in Iceland. However, the association of allele T of rs7903146 (Relative Risk=1.53, P=4.06×10⁻¹⁴, PAR=24.4%) was noticeably stronger than that provided by the other two variants. In the West African study group, after adjustment for relatedness and ethnic origin, we replicated the association of allele T of rs7903146 to type 2 diabetes (Relative Risk=1.45, 95% C.I.=1.20-1.76, P=0.000146, PAR=22.2%), but not in the case of the other two variants. This suggests that allele T of rs7903146 is either the risk variant itself or the closest known correlate of an unidentified risk variant. The exclusion of the markers DG10S478 and rs12255372 as at-risk markers in the West African group was possible because, unlike in populations of European ancestry, where the T allele of rs7903146 occurs almost exclusively on chromosomes carrying both allele X of DG10S478 and allele T of rs12255372, in West Africans the T allele of rs7903146 occurs with both alleles of DG10S478 and rs12255372. This is consistent with the observation that T is the ancestral allele of rs7903146, whereas allele X of DG10S478 and allele T of rs12255372 are both different from the chimpanzee reference sequence. More generally, this finding is also consistent with the expectation that relatively diverse populations, such as those of West Africa, provide the means to refine association signals detected in regions of strong linkage disequilibrium in more homogeneous populations.

Discussion

In this study we describe the identification of a novel candidate gene for type II diabetes within the previously reported 10q linkage region (10), encoding transcription factor 7-like 2 (TCF7L2, formerly TCF4) on 10q25.2. We show that it confers risk of type II diabetes in Iceland, Denmark and the US with similar frequency and relative risks. While the variant does not explain a substantial fraction of the familial clustering of type II diabetes, the population attributable risk of at least 20% is significant from a public health point of view. Compared to the non-carriers, the relative risks of heterozygous carriers of the at-risk composite allele (approximately 38% of the population) and homozygous carriers (about 7% of the population) are 1.45 and 2.41, respectively. Hence, this variant has enough predictive value to be of clinical use.

We report the variant as a type II diabetes-associated microsatellite, DG10S478, within the third intron of the TCF7L2 gene. The TCF7L2 gene product is a high mobility group (HMG) box-containing transcription factor which plays a role in the Wnt signalling pathway. This pathway is considered one of the key developmental and growth regulatory mechanisms of the cell; it is mediated by secreted glycoproteins, known as Wnts, which initiate many signalling cascades within target cells upon binding to a cognate receptor complex, consisting of a member of the Frizzled family and a member of the LDL receptor family, Lrp5/6 (24). Wnt signaling uncouples the central player in this pathway, β-catenin, from the degradation complex and translocates it to the nucleus where it transiently converts TCF factors from repressors into transcriptional activators (25). The β-catenin protein is also important for mediating cell adhesion through its binding of cadherins (15).

The NCBI RefSeq for TCF7L2 contains 14 exons. However, Duval et al (26) showed that TCF7L2 has 17 exons, of which 5 are alternative; in addition, it was reported that three alternative splice acceptor sites are used. This study also demonstrated the alternative use of three consecutive exons located in the 3′ end of the TCF7L2 gene which change the reading frames used in the last exon, leading to the synthesis of a large number of TCF7L2 isoforms with short, medium, or long COOH-terminal ends.

Similar to TCF7L2, five of the six positionally cloned genes for the rare Mendelian forms of Type II Diabetes, namely maturity-onset diabetes of the young (MODY), are transcription factors (27). Additional transcription factors have been implicated in the pathogenesis of type II diabetes, including peroxisome proliferator-activated receptor gamma (PPARγ) (7) and the forkhead gene family (28, 29). Noble et al described a missense mutation (C883A) in the related TCF7 gene in type 1 diabetes (30). However, it is not clear if TCF7 and TCF7L2 operate in the same pathway with respect to the pathogenesis of diabetes.

Mutations have been described in the TCF7L2 gene, including the deletion of an A in an (A)9 coding repeat (exon 17) (26, 31-33) and a number of mutations in colorectal cell lines (26). DG10S478 resides within a clearly defined 74.9 kb LD block (CEPH Caucasian HapMap Phase II) that encapsulates exon 4 and flanking intronic sequences 5′ and 3′ to the exon. It is possible that DG10S478 is the causative variant itself; it is also possible that DG10S478 is a surrogate for an underlying variant that affects transcription, splicing or message stability. Such a variant is likely to be in strong LD with DG10S478, i.e. the variant resides within the exon 4 LD block of TCF7L2.

Several lines of evidence suggest an enteroendocrine role of this gene in the pathogenesis of type II diabetes. Firstly, TCF7L2 has been implicated in the development of colorectal cancer (34), and small-molecule antagonists of the oncogenic TCF/β-catenin protein complex have already been described (35). In addition, TCF7L2−/− mice, which die within 24 hours after birth, lack an intestinal epithelial stem-cell compartment (36). Variants of the TCF7L2 gene could influence the susceptibility to type II diabetes through altering levels of the insulinotropic hormone glucagon-like peptide 1 (GLP-1), one of the peptides encoded by the proglucagon gene whose expression in enteroendocrine cells is transcriptionally regulated by TCF7L2. In concert with insulin, GLP-1 exerts crucial effects on blood glucose homeostasis (12). GLP-1 analogs and inhibitors of dipeptidyl peptidase IV are currently in clinical development.

The references cited in this specification are incorporated herein in their entirety.

REFERENCES

1. A. F. Amos, D. J. McCarty, P. Zimmet, Diabet Med 14 Suppl 5, SI (1997).
2. P. Zimmet et al., Am J Epidemiol 118, 673 (November, 1983).
3. W. C. Knowler, D. J. Pettitt, M. F. Saad, P. H. Bennett, Diabetes Metab Rev 6, 1 (February, 1990).
4. B. Newman et al., Diabetologia 30, 763 (October, 1987).
5. A. H. Barnett, C. Eff, R. D. Leslie, D. A. Pyke, Diabetologia 20, 87 (February, 1981).
6. A. L. Gloyn, Ageing Res Rev 2, 111 (April, 2003).
7. D. Altshuler et al., Nat Genet. 26, 76 (September, 2000).
8. A. L. Gloyn et al., Diabetes 52, 568 (February, 2003).
9. Y. Horikawa et al., Nat Genet. 26, 163 (October, 2000).
10. I. Reynisdottir et al., Am J Hum Genet. 73, 323 (August, 2003).
11. R. Duggirala et al., Am J Hum Genet. 64, 1127 (April, 1999).
12. F. Yi, P. L. Brubaker, T. Jin, J Biol Chem 280, 1457 (Jan. 14, 2005).
13. S. E. Ross et al., Science 289, 950 (Aug. 11, 2000).
14. E. A. Jansson et al., Proc Natl Acad Sci USA 102, 1460 (Feb. 1, 2005).
15. W. J. Nelson, R. Nusse, Science 303, 1483 (Mar. 5, 2004).
16. C. T. Falk, P. Rubinstein, Ann Hum Genet. 51 (Pt 3), 227 (July, 1987).
17. J. D. Terwilliger, J. Ott, Hum Hered 42, 337 (1992).
18. J. R. Gulcher, K. Kristjansson, H. Gudbjartsson, K. Stefansson, Eur J Hum Genet. 8, 739 (October, 2000).
19. Y. Z. R. Bagger, B. J.; Alexandersen, P.; Tanko, L. B.; Christiansen, C, J Bone Miner Res Suppl 1, 1 (2001).
20. G. Benson, Nucleic Acids Res 27, 573 (Jan. 15, 1999).
21. R. C. Lewontin, Genetics 50, 757 (October, 1964).
22. W. G. Hill, A. Robertson, Genetics 60, 615 (November, 1968).
23. C. N. Rotimi et al., Ann Epidemiol 11, 51 (January, 2001).
24. C. Prunier, B. A. Hocevar, P. H. Howe, Growth Factors 22, 141 (September, 2004).
25. J. Huelsken, W. Birchmeier, Curr Opin Genet Dev 11, 547 (October, 2001).
26. A. Duval et al., Cancer Res 60, 3872 (Jul. 15, 2000).
27. S. S. Fajans, G. I. Bell, K. S. Polonsky, N Engl J Med 345, 971 (Sep. 27, 2001).
28. C. Wolfrum, E. Asilmaz, E. Luca, J. M. Friedman, M. Stoffel, Nature 432, 1027 (Dec. 23, 2004).
29. J. Nakae et al., Nat Genet. 32, 245 (October, 2002).
30. J. A. Noble et al., Diabetes 52, 1579 (June, 2003).
31. A. Duval et al., Cancer Res 59, 4213 (Sep. 1, 1999).
32. A. Duval et al., Oncogene 18, 6806 (Nov. 18, 1999).
33. H. R. Chang et al., Cancer Lett (May 16, 2005).
34. N. A. Wong, M. Pignatelli, Am J Pathol 160, 389 (February, 2002).
35. M. Lepourcelet et al., Cancer Cell 5, 91 (January, 2004).
36. V. Korinek et al., Nat Genet. 19, 379 (August, 1998).

While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

1. A method of diagnosing an increased susceptibility to type II diabetes in an individual, comprising detecting a marker or haplotype associated with the exon 4 LD block of TCF7L2 in the individual, wherein the presence of the marker or haplotype is indicative of an increased susceptibility to type II diabetes.
2. The method of claim 1, wherein the marker or haplotype comprises at least one marker selected from the markers listed in Table 6.
3. The method of claim 1, wherein the increased susceptibility is characterized by a relative risk of at least 1.2.
4. A method of assessing an individual for probability of response to a TCF7L2 therapeutic agent, comprising: detecting a marker associated with the exon 4 LD block of TCF7L2, wherein the presence of the marker is indicative of a probability of a positive response to a TCF7L2 therapeutic agent.
5. The method of claim 4, wherein the marker is selected from the group consisting of DG10S478, rs12255372, rs7895340, rs11196205, rs7901695, rs7903146, rs12243326, and rs4506565.
6. The method of claim 5, wherein the marker is marker DG10S478, and wherein the presence of a non-0 allele in DG10S478 is indicative of a probability of a positive response to a TCF7L2 therapeutic agent.
7. The method of claim 5, wherein the marker is marker rs7903146, and wherein the presence of a T allele in rs7903146 is indicative of a probability of a positive response to a TCF7L2 therapeutic agent.
8. A method of diagnosing a decreased susceptibility to type II diabetes in an individual, comprising detecting a marker or haplotype associated with the exon 4 LD block of TCF7L2 in the individual, wherein the presence of the marker or haplotype is indicative of a decreased susceptibility to type II diabetes.
9. The method of claim 8, wherein the decreased susceptibility is characterized by a relative risk of less than 0.8.
10. A method of detecting an increased susceptibility to type II diabetes in an individual, comprising identifying the presence or absence of an allele at a marker associated with the exon 4 LD block of TCF7L2 in the individual, wherein identification of the presence of the allele is indicative of increased susceptibility to type II diabetes in the individual.
11. The method of claim 10, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with the exon 4 LD block of TCF7L2.
12. The method of claim 10, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with one or more of the markers listed in Table 6.
13. The method of claim 10, wherein the marker associated with the exon 4 LD block of TCF7L2 is selected from the group consisting of the markers listed in Table 6.
14. A method of detecting a decreased susceptibility to type II diabetes in an individual, comprising identifying the presence or absence of an allele at a marker associated with the exon 4 LD block of TCF7L2 in the individual, wherein identification of the presence of the allele is indicative of decreased susceptibility to type II diabetes in the individual.
15. The method of claim 14, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with the exon 4 LD block of TCF7L2.
16. The method of claim 14, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with one or more of the markers listed in Table 6.
17. The method of claim 14, wherein the marker associated with the exon 4 LD block of TCF7L2 is selected from the group consisting of the markers listed in Table 6.
18. A method of detecting an increased susceptibility to type II diabetes in an individual, comprising detecting an allele at a polymorphism associated with the exon 4 LD block of TCF7L2 in the individual, wherein identification of said allele at the polymorphism is indicative of increased risk of type II diabetes in the individual.
19. The method of claim 18, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with the exon 4 LD block of TCF7L2.
20. The method of claim 18, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with one or more of the markers listed in Table 6.
21. The method of claim 18, wherein the marker associated with the exon 4 LD block of TCF7L2 is selected from the group consisting of the markers listed in Table 6.
22. A method of detecting a decreased susceptibility to type II diabetes in an individual, comprising detecting an allele at a polymorphism associated with the exon 4 LD block of TCF7L2 in the individual, wherein identification of said allele at the polymorphism is indicative of decreased risk of type II diabetes in the individual.
23. The method of claim 22, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with the exon 4 LD block of TCF7L2.
24. The method of claim 22, wherein the marker associated with the exon 4 LD block of TCF7L2 is a marker in strong linkage disequilibrium, characterized by r² greater than 0.2, with one or more of the markers listed in Table 6.
25. The method of claim 22, wherein the marker associated with the exon 4 LD block of TCF7L2 is selected from the group consisting of the markers listed in Table 6.